Abstract. Let $\mathcal{T}_n$ denote the set of unrooted labeled trees of size $n$ and let $\mathcal{M}$ be a particular (finite, unlabeled) tree. Assuming that every tree of $\mathcal{T}_n$ is equally likely, it is shown that the limiting distribution as $n$ goes to infinity of the number of occurrences of $\mathcal{M}$ as an induced subtree is asymptotically normal with mean value and variance asymptotically equivalent to $\mu n$ and $\sigma^2 n$, respectively, where the constants $\mu > 0$ and $\sigma \ge 0$ are computable.

1. Introduction

In this paper we consider unrooted labeled trees and analyse the number of occurrences of a tree pattern as an induced subtree of a random tree. It is well known that a typical tree in $\mathcal{T}_n$, the set of unrooted labeled trees of size $n$, has about $\mu_k n$ nodes of degree $k$, where $\mu_k = 1/(e\,(k-1)!)$. Moreover, for any fixed $k$ the total number of nodes of degree $k$ over all trees in $\mathcal{T}_n$ satisfies a central limit theorem with mean and variance asymptotically equivalent to $\mu_k n$ and $\sigma_k^2 n$ (for a specific constant $\sigma_k > 0$). See [DG99], where Drmota and Gittenberger explored this phenomenon for unrooted labeled trees and other types of trees.

A node of degree $k$ is an occurrence of what can be called a star with $k$ edges. In this paper we continue this idea. We consider a pattern $\mathcal{M}$, a given finite tree, and compute the limiting distribution of the number of occurrences of $\mathcal{M}$ in $\mathcal{T}_n$ as $n \to \infty$. Note also that there can be overlaps of two or more copies of $\mathcal{M}$, which we intend to count as separate occurrences.

Our main result in this paper is:

Theorem 1. Let $\mathcal{M}$ be a given finite tree. Then the limiting distribution of the number of occurrences of $\mathcal{M}$ (as induced subtrees) in a tree of $\mathcal{T}_n$ is asymptotically normal with mean and variance asymptotically equivalent to $\mu n$ and $\sigma^2 n$, respectively, where $\mu > 0$ and $\sigma^2 \ge 0$ depend on the pattern $\mathcal{M}$, can be computed explicitly and algorithmically, and can be represented as polynomials (with rational coefficients) in $1/e$.

We consider here a random variable $X$ as Gaussian if its characteristic function is given by $\mathbb{E}\,e^{itX} = e^{i\mu t - \sigma^2 t^2/2}$, that is, the case of zero variance $\sigma^2 = 0$ is included here. For example, if $\mathcal{M}$ consists just of one edge (and two nodes), then the number of occurrences of $\mathcal{M}$ in a tree of $\mathcal{T}_n$ is $n-1$ and thus constant. So in that particular case we have $\mu = 1$ and $\sigma^2 = 0$. Nevertheless we conjecture that $\sigma^2 > 0$ in all other cases.

As already mentioned, the case of stars (or nodes of given degree) has been discussed in [DG99] for various classes of trees. Some previous work for unlabeled trees is due to Robinson and Schwenk [RS75]. Patterns in (rooted planar) trees have also been considered by Dershowitz and Zaks [DZ89] under the limitation that patterns start at the root. In a work on patterns in random binary search trees, Flajolet, Gourdon, and Martinez [FGM97] obtained a central limit theorem. Flajolet and Steyaert also analysed an algorithm for pattern matching in trees [FS80a, FS80b, SF83]. Further, Ruciński [Ruc88] established conditions for when the number of occurrences of a given subgraph in random graphs follows a normal distribution.

The plan of the paper is as follows. In Section 2 we give a short introduction to counting trees with generating functions, and also expand this to two variables for counting stars (nodes of specific degree $k$) in trees. In Section 3 we expand this framework to the counting of patterns in trees. The resulting asymptotics are presented in Section 4, concluding the proof of Theorem 1. Technical details for this as well as explicit algorithms can be found in the appendix. In fact, the algorithmic aspect is one of the driving forces of this paper.

2. Counting Trees and Counting Stars in Trees 

In this section we introduce a three-step program to count the number of trees in $\mathcal{T}_n$ and, in the same fashion, the number of occurrences of nodes of degree $k$ in $\mathcal{T}_n$. While redundant and probably heavy in this simplistic situation, this procedure was crucial to the derivation in [DG99] for counting stars and will generalise well to our setting of general tree patterns.

For this purpose we make use of the sets $\mathcal{R}_n$ of rooted labeled trees of size $n$ and $\mathcal{P}_n$ of planted labeled trees of size $n$. For rooted and unrooted trees, the size $n$ counts the total number of nodes, whether internal or at the leaves. On the other hand, a planted tree is just a rooted tree where the root is adjoined an additional "phantom" node which does not contribute to the size of the tree, whereas the degree of the root is increased by one. Equivalently, one can think of a planted tree as a rooted tree with an additional edge having no end vertex. The advantage of using planted trees, though it seems to add complexity, will be explained below. Obviously $|\mathcal{P}_n| = |\mathcal{R}_n|$ and $|\mathcal{T}_n| = |\mathcal{R}_n|/n$. It is also well known that $|\mathcal{R}_n| = n^{n-1}$ and $|\mathcal{T}_n| = n^{n-2}$.
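The two counting formulas can be confirmed by brute force for small $n$. The following is only an illustrative sketch and not part of the paper's method: the helper names `is_tree`, `count_unrooted` and `count_rooted` are hypothetical, and the approach (testing every set of $n-1$ edges for acyclicity with a union-find structure) is chosen for simplicity rather than speed.

```python
from itertools import combinations


def is_tree(n, edges):
    """Return True if the given n-1 edges are acyclic on vertices 0..n-1."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:  # this edge would close a cycle
            return False
        parent[ra] = rb
    return True  # n-1 acyclic edges on n vertices form a spanning tree


def count_unrooted(n):
    """Number of unrooted labeled trees on n nodes, by exhaustive search."""
    all_edges = list(combinations(range(n), 2))
    return sum(is_tree(n, chosen) for chosen in combinations(all_edges, n - 1))


def count_rooted(n):
    """Each unrooted labeled tree can be rooted at any of its n nodes."""
    return n * count_unrooted(n)
```

Under these assumptions, `count_unrooted(n)` and `count_rooted(n)` can be compared with $n^{n-2}$ and $n^{n-1}$ for, say, $2 \le n \le 6$; larger $n$ quickly becomes infeasible for this naive search.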

The three-step program is the following one: First, the generating function enumerating planted 
trees is determined, then it is used to count rooted trees by deriving their generating function, 
and finally the generating function counting unrooted trees is computed. 

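We define

\[
p(x) = \sum_{n \ge 0} |\mathcal{P}_n| \frac{x^n}{n!}, \qquad
r(x) = \sum_{n \ge 0} |\mathcal{R}_n| \frac{x^n}{n!}, \qquad
t(x) = \sum_{n \ge 0} |\mathcal{T}_n| \frac{x^n}{n!}
\]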

and proceed in the following way: 

(1) Planted Rooted Trees: A planted tree is a planted root node with zero, one, two, . . . planted subtrees of any order. In terms of the generating function this yields

\[
p(x) = \sum_{n \ge 0} \frac{x\,p(x)^n}{n!} = x e^{p(x)}.
\]

(2) Rooted Trees: For rooted trees we get the same (except for the phantom nodes, which are not present here), just a root with zero, one, two, . . . planted subtrees of any order:

\[
r(x) = \sum_{n \ge 0} \frac{x\,p(x)^n}{n!} = x e^{p(x)} = p(x).
\]

(3) Unrooted Trees: Finally, we have $|\mathcal{T}_n| = |\mathcal{R}_n|/n$, as already mentioned. However, we can also express $t(x)$ by a relation which follows from a natural bijection between rooted trees on the one hand and unrooted trees and pairs of planted rooted trees (that are joined by identifying the additional edges at their planted roots and discarding the phantom nodes) on the other hand.¹ This yields

\[
t(x) = r(x) - \frac{1}{2}\,p(x)^2 .
\]



¹ Consider the class of rooted (labeled) trees. If the root is labeled by 1 then consider the tree as an unrooted tree. If the root is not labeled by 1 then consider the first edge of the path between the root and 1 and cut the tree into two planted rooted trees at this edge.
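The relation $t(x) = r(x) - \frac12 p(x)^2$ can be checked coefficient by coefficient against the explicit counts $n^{n-1}$ and $n^{n-2}$. The sketch below is an illustration under stated assumptions: the truncation order `N` and the helper `times` are arbitrary choices made here, and exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction
from math import factorial

N = 12  # truncation order of the power series (an arbitrary choice)

# Coefficients [x^n] of the exponential generating functions p(x) = r(x):
# n^(n-1) / n! for n >= 1, and 0 for n = 0.
p = [Fraction(0)] + [Fraction(n ** (n - 1), factorial(n)) for n in range(1, N + 1)]
r = list(p)


def times(f, g):
    """Product of two power series truncated after x^N."""
    h = [Fraction(0)] * (N + 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            if i + j <= N:
                h[i + j] += fi * gj
    return h


p_squared = times(p, p)
t = [r[n] - p_squared[n] / 2 for n in range(N + 1)]

# n! [x^n] t(x) should be the number of unrooted labeled trees on n nodes.
unrooted_counts = [t[n] * factorial(n) for n in range(1, N + 1)]
```

The entries of `unrooted_counts` are then expected to agree with $n^{n-2}$ for $n = 1, \dots, N$.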
The functional equation for $p(x)$ can be used either to extract the explicit number $|\mathcal{P}_n| = n^{n-1}$ via Lagrange inversion or to obtain the radius of convergence and asymptotic expansions of the singular behaviour of this function. It is well known that $x_0 = 1/e$ is the common radius of convergence of $p(x)$, $r(x)$, and $t(x)$, and that the singularity at $x = x_0$ is of square-root type:

\[
p(x) = r(x) = 1 - \sqrt{2}\,\sqrt{1 - ex} + \frac{2}{3}(1 - ex) + \cdots,
\]
\[
t(x) = \frac{1}{2} - (1 - ex) + \frac{2\sqrt{2}}{3}(1 - ex)^{3/2} + \cdots .
\]

This is reflected by the asymptotic expansions of the numbers

\[
|\mathcal{P}_n| = |\mathcal{R}_n| = n^{n-1} \sim \frac{n!}{\sqrt{2\pi}}\, e^n n^{-3/2},
\qquad
|\mathcal{T}_n| = n^{n-2} \sim \frac{n!}{\sqrt{2\pi}}\, e^n n^{-5/2}.
\]
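As a numerical sanity check of these asymptotic equivalents, one can evaluate the ratio between the exact count and its approximation. The sketch below is an illustration, not taken from the paper: the function names are hypothetical, and logarithms (via `math.lgamma`) are used only to avoid overflow for large $n$.

```python
from math import exp, lgamma, log, pi


def ratio_rooted(n):
    """n^(n-1) divided by n! e^n n^(-3/2) / sqrt(2 pi)."""
    log_exact = (n - 1) * log(n)
    log_approx = lgamma(n + 1) + n - 1.5 * log(n) - 0.5 * log(2 * pi)
    return exp(log_exact - log_approx)


def ratio_unrooted(n):
    """n^(n-2) divided by n! e^n n^(-5/2) / sqrt(2 pi)."""
    log_exact = (n - 2) * log(n)
    log_approx = lgamma(n + 1) + n - 2.5 * log(n) - 0.5 * log(2 * pi)
    return exp(log_exact - log_approx)
```

Both ratios should tend to 1 as $n$ grows; by Stirling's formula the deviation is of order $1/n$.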

In order to demonstrate the usefulness of the three-step procedure above we repeat the same steps for counting stars with $k$ edges in trees, that is, the number of nodes of degree $k$, a given fixed positive number. Let $p_{n,m}$ denote the number of planted trees of size $n$ with exactly $m$ nodes of degree $k$. Furthermore, let $r_{n,m}$ and $t_{n,m}$ be the corresponding numbers for rooted and unrooted trees and set

\[
p(x,u) = \sum_{n,m \ge 0} p_{n,m} \frac{x^n u^m}{n!}, \qquad
r(x,u) = \sum_{n,m \ge 0} r_{n,m} \frac{x^n u^m}{n!}, \qquad
t(x,u) = \sum_{n,m \ge 0} t_{n,m} \frac{x^n u^m}{n!}.
\]

Then we have (compare with [DG99]):

(1) Planted Rooted Trees:

\[
p(x,u) = \sum_{\substack{n \ge 0 \\ n \ne k-1}} \frac{x\,p(x,u)^n}{n!} + \frac{x u\,p(x,u)^{k-1}}{(k-1)!}
       = x e^{p(x,u)} + \frac{x(u-1)\,p(x,u)^{k-1}}{(k-1)!}.
\]

(2) Rooted Trees:

\[
r(x,u) = \sum_{\substack{n \ge 0 \\ n \ne k}} \frac{x\,p(x,u)^n}{n!} + \frac{x u\,p(x,u)^{k}}{k!}
       = x e^{p(x,u)} + \frac{x(u-1)\,p(x,u)^{k}}{k!}.
\]

(3) Unrooted Trees: Similarly to the above we have $t_{n,m} = r_{n,m}/n$, which is sufficient for our purposes. However, as above, it is also possible to express $t(x,u)$ by

\[
t(x,u) = r(x,u) - \frac{1}{2}\,p(x,u)^2 .
\]

Note that the use of the notion of planted trees is crucial in order to keep track of the nodes of degree $k$ by means of the recursive structure of planted trees. In [DG99] this approach was used to show that the asymptotic distribution of the number of nodes of degree $k$ in trees of size $n$ is normal, with expectation and variance proportional to $n$.
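The constant $\mu_k = 1/(e\,(k-1)!)$ from the introduction can be illustrated independently of the generating functions. The sketch below swaps in a different, classical tool that the paper does not use here, namely the Prüfer correspondence: a labeled tree on $n$ nodes corresponds to a sequence of length $n-2$ over the node labels, and the degree of a node is one plus its number of appearances in the sequence. The function names are hypothetical.

```python
from itertools import product
from math import comb


def mean_degree_k_bruteforce(n, k):
    """Average number of degree-k nodes over all labeled trees on n nodes,
    obtained by enumerating all n^(n-2) Pruefer sequences."""
    total = 0
    for seq in product(range(n), repeat=n - 2):
        total += sum(1 for v in range(n) if seq.count(v) + 1 == k)
    return total / n ** (n - 2)


def mean_degree_k_exact(n, k):
    """Same average in closed form: degree - 1 is Binomial(n - 2, 1/n)."""
    return n * comb(n - 2, k - 1) * (1 / n) ** (k - 1) * (1 - 1 / n) ** (n - 1 - k)
```

For small $n$ the two functions should agree, and `mean_degree_k_exact(n, k) / n` should approach $1/(e\,(k-1)!)$ as $n$ grows, since $\binom{n-2}{k-1} n^{-(k-1)} \to 1/(k-1)!$ and $(1 - 1/n)^{n} \to 1/e$.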

3. Counting Patterns in Trees 

We now generalize the counting procedure of Section 2 to more complicated patterns. For our purpose, a pattern is a given (finite unrooted unlabeled) tree $\mathcal{M}$. To ease explanations, we will use as $\mathcal{M}$ the example graph in Figure 1.

We say that a specific pattern $\mathcal{M}$ occurs in a tree $T$ if $\mathcal{M}$ occurs in $T$ as an induced subtree in the sense that the node degrees for the internal (filled) nodes in the pattern match the degrees of the corresponding nodes in $T$, while the external (empty) nodes match nodes of arbitrary degree.²



² More generally we could also consider pattern-matching problems for patterns in which some degrees of certain possibly external "filled" nodes must match exactly while the degrees of the other, possibly internal "empty" nodes might be different. But then the situation is more involved, see Section 5.
Figure 1. Example pattern

Because the results for the patterns consisting of only one node or two nodes and one edge are 
trivial, we now concentrate on patterns with at least three nodes. 

Our principal aim is to get relations for the generating functions which count the number of occurrences of a specific pattern $\mathcal{M}$. Let $p_{n,m}$ denote the number of planted rooted trees with $n$ nodes and exactly $m$ occurrences of the pattern $\mathcal{M}$ and let

\[
p = p(x,u) = \sum_{n,m \ge 0} p_{n,m} \frac{x^n u^m}{n!}
\]

be the corresponding generating function.

3.1. Generating Functions for Planted Rooted Trees. 

Proposition 1. (Planted Rooted Trees.) Let $\mathcal{M}$ be a pattern. Then there exists a certain number $L + 1$ of auxiliary functions $a_j(x,u)$ ($0 \le j \le L$) with

\[
p(x,u) = \sum_{j=0}^{L} a_j(x,u)
\]

and polynomials $P_j(y_0, \dots, y_L, u)$ ($1 \le j \le L$) with non-negative coefficients such that

\[
\begin{aligned}
a_0(x,u) &= x e^{a_0(x,u) + \cdots + a_L(x,u)} - x \sum_{j=1}^{L} P_j(a_0(x,u), \dots, a_L(x,u), 1), \\
a_1(x,u) &= x \cdot P_1(a_0(x,u), \dots, a_L(x,u), u), \\
&\;\;\vdots \\
a_L(x,u) &= x \cdot P_L(a_0(x,u), \dots, a_L(x,u), u).
\end{aligned}
\tag{1}
\]

Furthermore,

\[
\sum_{j=1}^{L} P_j(y_0, \dots, y_L, 1) \le_c e^{y_0 + \cdots + y_L},
\]

where $f \le_c g$ means that all Taylor coefficients of the left-hand side are smaller than or equal to the corresponding coefficients of the right-hand side. Moreover, the dependency graph of this system is strongly connected.³

The proof of this proposition is in fact the core of the paper. In order to make the arguments more transparent we will demonstrate them with the help of the example pattern in Figure 1. At each step of the proof we will also indicate how to make all constructions explicit so that it is possible to generate System (1) effectively.

In a first step we introduce the notion of a planted pattern. A planted pattern $\mathcal{M}_p$ is just a planted rooted tree where we again distinguish between internal (filled) and external (empty) nodes. It matches a planted rooted tree from $\mathcal{P}_n$ if $\mathcal{M}_p$ occurs as an induced subtree starting from the (planted) root, that is, the branch structure and node degrees of the filled nodes match. Two occurrences may overlap. For example, in Figure 2 the planted pattern $\mathcal{M}_p$ on the left matches the planted tree A twice (following the left, resp. the right edge from the root), but B not at all.



³ The notion of dependency graph is explained in Appendix B and, intuitively speaking, reflects the fact that no subsystem can be solved before the whole system.




Also remark that, notwithstanding the symmetry of C, the pattern $\mathcal{M}_p$ really matches C twice, as we are interested in matches in labeled trees.

We now construct a planted pattern for each internal (filled) node of our pattern $\mathcal{M}$ which is adjacent to an external (empty) node. The internal (filled) node is considered as the planted root and one of the free attached leaves as the plant. In our example we obtain the two graphs in Figure 3.

The next step is to partition all planted trees according to their degree distribution up to some adequate level. To this end, let $D$ denote the set of out-degrees that occur in the planted patterns introduced above and $h$ be the maximal height of these patterns. In our example we have $D = \{2\}$ and $h = 3$. To obtain a partition, we consider more precisely all trees of height less than or equal to $h$ with out-degrees in $D$. We distinguish two types of leaves in these trees, depending on the depth at which they appear: leaves at level $h$, denoted "○", and leaves at levels less than $h$, denoted "□". For our example we get 11 different trees $a_0, a_1, \dots, a_{10}$, depicted in Figure 4.

These trees induce a natural partition of all planted trees for the following interpretation of the two types of leaves: We say that a tree $T$ is contained in class⁴ $a_j$ if it matches the finite tree (or pattern) $a_j$ in such a way that a node of type □ has degree not in $D$, while a node of type ○ has any degree. For example, $a_0$ corresponds to those planted trees where the out-degree of the root is not in $D$.

It is easy to observe that these (obviously disjoint) classes of trees form a partition. Indeed, 
take any rooted tree. For any path from the root to a leaf, consider the first node with out-degree 
not in D, and replace the whole subtree at it with □. Then replace any node at depth h with o. 
The tree obtained in this way is one in the list. 
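The truncation argument above translates directly into a small program. The sketch below is an illustration only: the encoding of trees as nested tuples of children, the string markers `"sq"` (for □) and `"o"` (for ○), and the function names are all assumptions made here. Shapes are stored as sorted tuples so that the order of subtrees does not matter.

```python
from itertools import combinations_with_replacement

SQUARE, CIRCLE = "sq", "o"  # hypothetical markers for the two leaf types


def canon(children):
    """Sort child shapes so that a multiset of subtrees has one representation."""
    return tuple(sorted(children, key=repr))


def shapes(depth, D):
    """All truncated shapes when `depth` further levels are inspected."""
    if depth == 0:
        return [CIRCLE]
    below = shapes(depth - 1, D)
    result = [SQUARE]  # root out-degree not in D
    for d in sorted(D):
        for kids in combinations_with_replacement(below, d):
            result.append(canon(kids))
    return result


def classify(tree, depth, D):
    """Shape of a rooted tree given as nested tuples of its children."""
    if depth == 0:
        return CIRCLE  # nodes at level h may have any degree
    if len(tree) not in D:
        return SQUARE  # first node on the path with out-degree outside D
    return canon(classify(child, depth - 1, D) for child in tree)


classes = shapes(3, {2})  # the example: D = {2}, h = 3
```

With $D = \{2\}$ and $h = 3$ this enumeration produces 11 shapes, in agreement with the classes $a_0, \dots, a_{10}$ of the example, and `classify` maps any rooted tree to exactly one of them.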

Furthermore, the classes above can be described recursively. To this end, it proves convenient to introduce a formal notation to describe operations between classes of trees: $\oplus$ denotes the disjoint union of classes; $\setminus$ denotes set difference; recursive descriptions of tree classes are given in the form $a_i = x\,a_{j_1}^{e_1} \cdots a_{j_r}^{e_r}$, to express that the class $a_i$ is constructed by attaching $e_1$ subtrees from the class $a_{j_1}$, $e_2$ subtrees from the class $a_{j_2}$, etc., to a root node that we denote $x$.



⁴ By abuse of notation the tree class corresponding to the finite tree $a_j$ is denoted by the same symbol $a_j$.
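In our example we get the following relations:

\[
\begin{aligned}
a_0 &= p \setminus \bigoplus_{i=1}^{10} a_i = x \oplus x \bigoplus_{i=0}^{10} a_i \oplus x \bigoplus_{n \ge 3} \Bigl( \bigoplus_{i=0}^{10} a_i \Bigr)^{n}, \\
a_1 &= x a_0^2, \\
a_2 &= x a_0 a_1, \\
a_3 &= x a_0 (a_2 \oplus a_3 \oplus a_4), \\
a_4 &= x a_0 (a_5 \oplus a_6 \oplus a_7 \oplus a_8 \oplus a_9 \oplus a_{10}), \\
a_5 &= x a_1^2, \\
a_6 &= x a_1 (a_2 \oplus a_3 \oplus a_4), \\
a_7 &= x a_1 (a_5 \oplus a_6 \oplus a_7 \oplus a_8 \oplus a_9 \oplus a_{10}), \\
a_8 &= x (a_2 \oplus a_3 \oplus a_4)^2, \\
a_9 &= x (a_2 \oplus a_3 \oplus a_4)(a_5 \oplus a_6 \oplus a_7 \oplus a_8 \oplus a_9 \oplus a_{10}), \\
a_{10} &= x (a_5 \oplus a_6 \oplus a_7 \oplus a_8 \oplus a_9 \oplus a_{10})^2 .
\end{aligned}
\]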



This is to be interpreted as follows. Trees in $a_1$ consist of a (planted) root, denoted by $x$, that has out-degree 2, and two children that are of out-degree distinct from 2, that is, in $a_0$. Similarly, trees in $a_3$ consist of a root $x$ with out-degree 2, subject to the following additional constraints: one subtree at the root is exactly of type $a_0$; the other subtree, call it $T$, is of out-degree 2, either with both subtrees of degree other than 2 (leading to $T$ in $a_2$), or with one subtree of degree 2 and the other of degree other than 2 (leading to $T$ in $a_3$), or with both of its subtrees of degree 2 (leading to $T$ in class $a_4$). Summarizing: $a_3 = x a_0 (a_2 \oplus a_3 \oplus a_4)$. Of course this can also be interpreted as $a_3 = x a_0 a_2 \oplus x a_0 a_3 \oplus x a_0 a_4$. Another more involved example corresponds to $a_8$; here both subtrees are of the form $a_2 \oplus a_3 \oplus a_4$.

To show that the recursive description can be obtained easily in general, consider a tree $a_j$ obtained from some planted pattern $\mathcal{M}_p$. Let $s_1, \dots, s_d$ denote its subtrees at the root. Then, in each $s_i$, leaves of type ○ can appear only at level $h - 1$. Substitute for all such ○ either □ or a node of out-degree chosen from $D$ and having ○ for all its subtrees. Do this substitution in all possible ways. The collection of trees obtained are some of the $a_i$'s, say $a_{k_1^{(i)}}, a_{k_2^{(i)}}$, etc. Thus, we obtain the recursive relation

\[
a_j = x \bigl( a_{k_1^{(1)}} \oplus a_{k_2^{(1)}} \oplus \cdots \bigr) \cdots \bigl( a_{k_1^{(d)}} \oplus a_{k_2^{(d)}} \oplus \cdots \bigr)
\]

for $a_j$.

Figure 4. Tree partition

In general, we obtain a partition of $L + 1$ classes $a_0, \dots, a_L$, and corresponding recursive descriptions, where each tree type $a_j$ can be expressed as a disjoint union of tree classes of the kind

\[
x\,a_{j_1} \cdots a_{j_r} = x\,a_0^{l_0} \cdots a_L^{l_L},
\tag{2}
\]

where $r$ denotes the degree of the root of $a_j$ and the non-negative integer $l_i$ is the number of repetitions of the tree type $a_i$.

We proceed to show that this directly leads to a system of equations of the form (1), where each polynomial relation stems from a recursive equation between combinatorial classes.

Let $\Lambda_j$ be the set of tuples $(l_0, \dots, l_L)$ with the property that $(l_0, \dots, l_L) \in \Lambda_j$ if and only if the term of type (2) is involved in the recursive description of $a_j$ (in expanded form). Further, let $k = K(l_0, \dots, l_L)$ denote the number of additional occurrences of the pattern in (2) in the following sense: if $b = x\,a_{j_1} \cdots a_{j_r}$ and $T$ is a (planted rooted) labeled tree of $b$ with subtrees $T_1 \in a_{j_1}$, $T_2 \in a_{j_2}$, etc., and $\mathcal{M}$ occurs $m_1$ times in $T_1$, $m_2$ times in $T_2$, etc., then $T$ contains $\mathcal{M}$ exactly $m_1 + m_2 + \cdots + m_r + k$ times. The number $k$ corresponds to the number of occurrences of $\mathcal{M}$ in $T$ in which the root of $T$ occurs as internal node of the pattern. By construction of the classes $a_i$ this number only depends on $b$ and not on the particular tree $T \in b$. Let us clarify the calculation of $k = K(l_0, \dots, l_L)$ with an example. Consider the class $a_9$ of the partition for the example pattern. Now, in order to determine the number of additional occurrences, we match the planted patterns of Figure 3 at the root of an arbitrary tree of class $a_9$. The left planted pattern of Figure 3 matches three times, the right one matches once. Thus we find that in this case $k = 4$. For the other classes we find the following values of $k = K(l_0, \dots, l_L)$:



Terms of class   a_0   a_1   a_2   a_3   a_4   a_5   a_6   a_7   a_8   a_9   a_10
Value of k        0     0     0     1     2     1     2     3     3     4     5



Now define series $P_j$ by

\[
P_j(y_0, \dots, y_L, u) = \sum_{(l_0, \dots, l_L) \in \Lambda_j} \frac{1}{l_0! \cdots l_L!}\, y_0^{l_0} \cdots y_L^{l_L}\, u^{K(l_0, \dots, l_L)} .
\]

These are in fact polynomials for $1 \le j \le L$ by the finiteness of the corresponding $\Lambda_j$. All matches of the planted patterns are handled in the $P_j$, $1 \le j \le L$, thus

\[
P_0(y_0, \dots, y_L, u) = e^{y_0 + \cdots + y_L} - \sum_{j=1}^{L} P_j(y_0, \dots, y_L, 1)
\]

does not depend on $u$.

In our pattern we get for example for $P_8(y_0, \dots, y_{10}, u)$

\[
P_8(y_0, \dots, y_{10}, u) = \tfrac{1}{2} y_2^2 u^3 + y_2 y_3 u^3 + y_2 y_4 u^3 + \tfrac{1}{2} y_3^2 u^3 + y_3 y_4 u^3 + \tfrac{1}{2} y_4^2 u^3 = \tfrac{1}{2} (y_2 + y_3 + y_4)^2 u^3 .
\]
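The coefficients $1/(l_0! \cdots l_L!)$ in the definition of $P_j$ are exactly what one obtains by expanding a power of a sum of classes and dividing by the factorial of the exponent. The following sketch illustrates this for the class $a_8$; the function names and the encoding of a monomial as a sorted tuple of variable indices are assumptions made for this illustration, and the common factor $u^3$ is left out.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations_with_replacement, product
from math import factorial


def poly_from_tuples(indices, degree):
    """Coefficient 1/(l_0! ... l_L!) for every multiset of `degree` indices."""
    poly = {}
    for combo in combinations_with_replacement(sorted(indices), degree):
        coeff = Fraction(1)
        for multiplicity in Counter(combo).values():
            coeff /= factorial(multiplicity)
        poly[combo] = coeff
    return poly


def poly_from_power(indices, degree):
    """Expand (sum of y_i)^degree / degree! by brute force over ordered words."""
    poly = {}
    for word in product(indices, repeat=degree):
        key = tuple(sorted(word))
        poly[key] = poly.get(key, Fraction(0)) + Fraction(1, factorial(degree))
    return poly


# Class a_8: two subtrees, each taken from a_2, a_3 or a_4.
P8 = poly_from_tuples([2, 3, 4], 2)
```

Both constructions should give the same six monomials, with coefficient $\tfrac12$ for the squares $y_i^2$ and coefficient $1$ for the mixed products $y_i y_j$.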

Finally, let $a_{j;n,m}$ denote the number of planted rooted trees of type $a_j$ with $n$ nodes and $m$ occurrences of the pattern $\mathcal{M}$ and set

\[
a_j(x,u) = \sum_{n,m \ge 0} a_{j;n,m} \frac{x^n u^m}{n!} .
\]

By this definition it is clear that

\[
a_j(x,u) = x \cdot P_j(a_0(x,u), \dots, a_L(x,u), u),
\]

because the size of labeled trees is counted by $x$ (exponential generating function) and the occurrences of the pattern are additive and counted by $u$. Hence, we explicitly obtain the proposed structure of the system of functional equations (1).






FREDERIC CHYZAK, MICHAEL DRMOTA, THOMAS KLAUSNER, AND GERARD KOK



For the example pattern we arrive at the following system of equations, where we denote the generating function of the class $a_j$ by the same symbol $a_j$.

\[
\begin{aligned}
a_0 = a_0(x,u) &= p - \sum_{i=1}^{10} a_i = x + x \sum_{i=0}^{10} a_i + x \sum_{n \ge 3} \frac{1}{n!} \Bigl( \sum_{i=0}^{10} a_i \Bigr)^{n}, \\
a_1 = a_1(x,u) &= \tfrac{1}{2} x a_0^2, \\
a_2 = a_2(x,u) &= x a_0 a_1, \\
a_3 = a_3(x,u) &= x a_0 (a_2 + a_3 + a_4)\,u, \\
a_4 = a_4(x,u) &= x a_0 (a_5 + a_6 + a_7 + a_8 + a_9 + a_{10})\,u^2, \\
a_5 = a_5(x,u) &= \tfrac{1}{2} x a_1^2\,u, \\
a_6 = a_6(x,u) &= x a_1 (a_2 + a_3 + a_4)\,u^2, \\
a_7 = a_7(x,u) &= x a_1 (a_5 + a_6 + a_7 + a_8 + a_9 + a_{10})\,u^3, \\
a_8 = a_8(x,u) &= \tfrac{1}{2} x (a_2 + a_3 + a_4)^2\,u^3, \\
a_9 = a_9(x,u) &= x (a_2 + a_3 + a_4)(a_5 + a_6 + a_7 + a_8 + a_9 + a_{10})\,u^4, \\
a_{10} = a_{10}(x,u) &= \tfrac{1}{2} x (a_5 + a_6 + a_7 + a_8 + a_9 + a_{10})^2\,u^5 .
\end{aligned}
\]

In order to complete the proof of Proposition 1 we just have to show that the dependency graph is strongly connected. By construction, $a_0 = a_0(x,u)$ depends on all functions $a_i = a_i(x,u)$. Thus, it is sufficient to prove that every $a_i$ ($1 \le i \le L$) also depends on $a_0$. For this purpose consider the subtree of $\mathcal{M}$ that was labeled by $a_i$ and consider a path from its root to an empty node. Each edge of this path corresponds to another subtree of $\mathcal{M}$, say $a_{i_2}, a_{i_3}, \dots, a_{i_r}$. Then, by construction of the system of functional equations above, $a_i$ depends on $a_{i_2}$, $a_{i_2}$ depends on $a_{i_3}$, etc. Finally the root of $a_{i_r}$ is adjacent to an empty node and thus (the corresponding generating function) depends on $a_0$. This completes the proof of Proposition 1.

Note that we obtain a more compact form of this system by introducing

\[
\begin{aligned}
b_0 = b_0(x,u) &= a_0(x,u), \\
b_1 = b_1(x,u) &= a_1(x,u), \\
b_2 = b_2(x,u) &= a_2(x,u) + a_3(x,u) + a_4(x,u), \\
b_3 = b_3(x,u) &= a_5(x,u) + a_6(x,u) + a_7(x,u) + a_8(x,u) + a_9(x,u) + a_{10}(x,u),
\end{aligned}
\tag{3}
\]

together with the recursive relations

\[
\begin{aligned}
b_0 &= x e^{b_0 + b_1 + b_2 + b_3} - \tfrac{1}{2} x (b_0 + b_1 + b_2 + b_3)^2, \\
b_1 &= \tfrac{1}{2} x b_0^2, \\
b_2 &= x b_0 b_1 + x b_0 b_2 u + x b_0 b_3 u^2, \\
b_3 &= \tfrac{1}{2} x b_1^2 u + x b_1 b_2 u^2 + x b_1 b_3 u^3 + \tfrac{1}{2} x b_2^2 u^3 + x b_2 b_3 u^4 + \tfrac{1}{2} x b_3^2 u^5 .
\end{aligned}
\]

The combinatorial classes corresponding to the $b_i$ (which we will also denote by $b_i$) have the interpretation shown in Figure 5. We could have obtained the classes $b_i$ directly by restricting the construction to a maximal depth $h - 1$ instead of $h$. In principle, we could then apply the analytic treatment of Section 4 to the system of the $b_i$. However, we feel that the existence of a recursive structure of the system of the $b_i$ with a well-defined $K(l_0, \dots, l_L)$ for each term in the recursive description is slightly less clear. Therefore we preferred to work with the $a_i$, which have a well-defined $K(a_i)$. In Appendix A we will discuss another algorithm that yields in general even more compact systems of equations.
Figure 5. The classes corresponding to the $b_i$ of equations (3)



3.2. From Planted Rooted Trees to Rooted and Unrooted Trees. The next step is to find equations for the exponential generating function of rooted trees (where occurrences of the pattern are marked with $u$). As above we set

\[
r(x,u) = \sum_{n,m \ge 0} r_{n,m} \frac{x^n u^m}{n!},
\]

where $r_{n,m}$ denotes the number of rooted trees of size $n$ with exactly $m$ occurrences of the pattern $\mathcal{M}$. (That is, occurrences of the rooted patterns $\mathcal{M}_r$ deducible from $\mathcal{M}$. Here, a rooted pattern is defined in a very similar way to a planted pattern.)

Proposition 2. (Rooted Trees.) Let $\mathcal{M}$ be a pattern and let

\[
a_0(x,u), \dots, a_L(x,u)
\]

denote the auxiliary functions introduced in Proposition 1. Then there exists a polynomial $Q(y_0, \dots, y_L, u)$ with non-negative coefficients satisfying $Q(y_0, \dots, y_L, 1) \le_c e^{y_0 + \cdots + y_L}$, and such that

\[
r(x,u) = G(x, u, a_0(x,u), \dots, a_L(x,u))
\tag{4}
\]

for

\[
G(x, u, y_0, \dots, y_L) = x \bigl( e^{y_0 + \cdots + y_L} - Q(y_0, \dots, y_L, 1) + Q(y_0, \dots, y_L, u) \bigr).
\tag{5}
\]



Proof. The proof is in principle a direct continuation of the proof of Proposition 1. We recall that a rooted tree is just a root with zero, one, two, . . . planted subtrees, i.e., the class of rooted trees can be described as a disjoint union of classes $c$ of rooted trees of the form $x\,a_{j_1} \cdots a_{j_d}$. Furthermore, let $l_i$ denote the number of classes $a_i$ in this term such that $c = x\,a_0^{l_0} \cdots a_L^{l_L}$, and set $\bar{K}(l_0, \dots, l_L)$ to be the number of additional occurrences of the pattern $\mathcal{M}$. This number again corresponds to the number of occurrences of $\mathcal{M}$ in a (rooted) tree $T \in c$ in which the root of $T$ occurs as internal node of the pattern. Set

\[
Q_d(y_0, \dots, y_L, u) = \sum_{l_0 + \cdots + l_L = d} \frac{1}{l_0! \cdots l_L!}\, y_0^{l_0} \cdots y_L^{l_L}\, u^{\bar{K}(l_0, \dots, l_L)} .
\]

Then by construction

\[
r(x,u) = x \sum_{d \ge 0} Q_d(a_0(x,u), \dots, a_L(x,u), u).
\]

Note that $\sum_{d \ge 0} Q_d(y_0, \dots, y_L, 1) = e^{y_0 + \cdots + y_L}$. Let $\bar{D}$ denote the set of degrees of the internal (filled) nodes of the pattern, that is, $\bar{D} = \{ d + 1 : d \in D \}$; then $Q_d(y_0, \dots, y_L, u)$ does not depend on $u$ if $d \notin \bar{D}$. With

\[
Q(y_0, \dots, y_L, u) := \sum_{d \in \bar{D}} Q_d(y_0, \dots, y_L, u),
\]

we obtain (4) and (5). The number $\bar{K}(l_0, \dots, l_L)$ is well-defined for a similar reason as was $K(l_0, \dots, l_L)$, and can be calculated similarly. □

We again illustrate the proof with our example. In Figure 6 the corresponding rooted patterns are shown. For convenience let $r_0 = r_0(x,u)$ denote the function

\[
r_0 = x \sum_{\substack{n \ge 0 \\ n \ne 3}} \frac{p^n}{n!},
\]

where $p = a_0 + \cdots + a_{10}$. The function $r_0$ might also be interpreted as a catch-all function for the "uninteresting" subtrees: just a root $x$ with an unspecified number of planted trees attached, except the ones we handle differently, namely the cases $d \in \bar{D} = \{3\}$. The generating function $r = r(x,u)$ for rooted trees is then given by

\[
r = r_0 + \frac{1}{6} x b_0^3 + \frac{1}{2} x \sum_{1 \le i \le 3} b_0^2 b_i u^{i-1} + \frac{1}{2} x \sum_{1 \le i,j \le 3} b_0 b_i b_j u^{i+j-1} + \frac{1}{6} x \sum_{1 \le i,j,k \le 3} b_i b_j b_k u^{i+j+k},
\]

where the $b_i$ are defined in (3).

Figure 6. Rooted patterns for the pattern in Figure 1

As above we have $t_{n,m} = r_{n,m}/n$, where $t_{n,m}$ denotes the number of unrooted trees with $n$ nodes and exactly $m$ occurrences of the pattern $\mathcal{M}$. This relation is sufficient for our purposes. It is also possible to express the corresponding generating function $t(x,u)$. In a way similar to before, we can define the number of additional occurrences $K(i,j)$ of the pattern $\mathcal{M}$ that appear by constructing an unrooted tree from two planted trees of the classes $a_i$ and $a_j$ by identifying the additional edges at their planted roots and discarding the phantom nodes. For our example we get

\[
t(x,u) = r(x,u) - \frac{1}{2}\,p(x,u)^2 - \frac{1}{2} \sum_{1 \le i,j \le 3} b_i(x,u)\,b_j(x,u)\,\bigl(u^{i+j-2} - 1\bigr).
\]

4. Asymptotic Behavior 

Since we are not interested in the actual number of occurrences of the pattern, but only in its asymptotic behavior, we do not have to compute explicit formulae from the system of equations. Instead, we apply a result slightly adapted from [Drm97], which we state and discuss in Appendix B. In fact, it is immediately clear that Theorem 2 in this appendix, whose object is the proof of Gaussian limiting distributions, applies to the kind of problem we are interested in: the assertions of Propositions 1 and 2 exactly fit the assumptions of Theorem 2.

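The only missing point is the existence of a non-negative solution $(x_0, \mathbf{a}_0)$ of the system

\[
\mathbf{a} = F(x, \mathbf{a}, 1),
\tag{6}
\]
\[
0 = \det\bigl(I - F_{\mathbf{a}}(x, \mathbf{a}, 1)\bigr),
\tag{7}
\]

where $\mathbf{a} = F(x, \mathbf{a}, u)$ denotes the system (1) of functional equations of Proposition 1 and $F_{\mathbf{a}}$ is the Jacobian matrix of $F$. Since the sum of all unknown functions $p(x,u)$ is known for $u = 1$:

\[
p(x,1) = \sum_{n \ge 1} n^{n-1} \frac{x^n}{n!} = 1 - \sqrt{2}\,\sqrt{1 - ex} + \cdots,
\]

it is not unexpected that $x_0 = 1/e$.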

Proposition 3. There exists a unique non-negative solution $(x_0, \mathbf{a}_0)$ of System (6)–(7), for which $x_0 = 1/e$ and the components of $\mathbf{a}_0$ are polynomials (with rational coefficients) in $1/e$.

Proof. For a proof, set $u = 1$ and consider the solution $\mathbf{a}(x,1) = (a_0(x,1), \dots, a_L(x,1))$. Since the dependency graph is strongly connected it follows that all functions $a_j(x,1)$ have the same radius of convergence, which has to be $x_0 = 1/e$, and all functions are singular at $x = x_0$. Since $0 \le a_j(x,1) \le p(x,1) < \infty$ for $0 \le x \le x_0$ it also follows that $a_j(x_0,1)$ is finite, and we have $\mathbf{a}(x_0,1) = F(x_0, \mathbf{a}(x_0,1), 1)$. If we had the inequality $\det(I - F_{\mathbf{a}}(x_0, \mathbf{a}(x_0,1), 1)) \ne 0$ then the implicit function theorem would imply the existence of an analytic continuation for $a_j(x,1)$ around $x = x_0$, which is, of course, a contradiction. Thus, the determinant is zero and system (6)–(7) has a unique solution.

To see that the components $\bar{a}_0, \dots, \bar{a}_L$ (with $\bar{a}_j = a_j(1/e, 1)$) of $\mathbf{a}_0$ are polynomials in $1/e$ we will construct the partition $A = \{a_0, a_1, \dots, a_L\}$ on which the system of equations (1) is based by refining step by step the trivial partition consisting of only one class $p$. The recursive description of this trivial partition is given by the formal equation $p = x \bigoplus_{i \ge 0} p^i$. Additionally, the solution of the corresponding equation $p = x \exp(p)$ for the generating function $p$ (denoted by the same symbol $p$) is given by $(x_0, \bar{p}) = (1/e, 1)$, with $\bar{p}$ clearly a (constant) polynomial in $1/e$. Now let $D = \{d_1, \dots, d_s\}$ ($s \in \mathbb{N}$) again denote the set of out-degrees that occur in the planted patterns. We will refine $p$ by introducing for each $d_i \in D$ a class $a_i$ consisting of all trees of root out-degree $d_i$, as well as a class $a_0$ for trees with root out-degree not in $D$. The partition $\{a_0, a_1, \dots, a_s\}$ has the recursive description

\[
\begin{aligned}
a_0 &= x \bigoplus_{j \in \mathbb{N} \setminus D} (a_0 \oplus a_1 \oplus \cdots \oplus a_s)^{j}, \\
a_i &= x (a_0 \oplus a_1 \oplus \cdots \oplus a_s)^{d_i} \qquad (i = 1, \dots, s),
\end{aligned}
\tag{8}
\]

and the solution of the corresponding system of equations

\[
\begin{aligned}
a_0(x,1) &= x \sum_{j \in \mathbb{N} \setminus D} \frac{1}{j!} \bigl( a_0(x,1) + a_1(x,1) + \cdots + a_s(x,1) \bigr)^{j} \\
&= x e^{a_0(x,1) + \cdots + a_s(x,1)} - x \sum_{i=1}^{s} \frac{1}{d_i!} \bigl( a_0(x,1) + a_1(x,1) + \cdots + a_s(x,1) \bigr)^{d_i} \\
&= x e^{p(x)} - x \sum_{i=1}^{s} \frac{1}{d_i!}\, p(x)^{d_i}, \\
a_i(x,1) &= \frac{x}{d_i!} \bigl( a_0(x,1) + a_1(x,1) + \cdots + a_s(x,1) \bigr)^{d_i} = \frac{x}{d_i!}\, p(x)^{d_i} \qquad (i = 1, \dots, s),
\end{aligned}
\tag{9}
\]

is given by

\[
x_0 = 1/e, \qquad \bar{a}_i = \frac{1}{d_i!\, e} \quad (i = 1, \dots, s), \qquad \bar{a}_0 = 1 - (\bar{a}_1 + \cdots + \bar{a}_s),
\tag{10}
\]

thus again polynomials in $1/e$. We continue by refining this last partition by introducing classes $c_1, \dots, c_m$ (for some $m \in \mathbb{N}$) for each term at the right-hand side of (8) after expanding the "multinomial". Such a class $c_j$ is of the form $c_j = x\,a_0^{l_0} \cdots a_s^{l_s}$ with natural numbers $l_i$, $i = 0, \dots, s$. We get a new partition $\{a_0, c_1, \dots, c_m\}$ which has a recursive description by construction (because we can replace the $a_i$ by disjoint unions of certain $c_j$). The corresponding system of equations for the generating functions is given by

\[
c_j(x,1) = \frac{x}{l_0!\, l_1! \cdots l_s!}\, a_0(x,1)^{l_0}\, a_1(x,1)^{l_1} \cdots a_s(x,1)^{l_s} \qquad (j = 1, \dots, m)
\]

and consequently we have for $x_0 = 1/e$ the solution

\[
\bar{c}_j = \frac{1}{e} \cdot \frac{1}{l_0!\, l_1! \cdots l_s!}\, \bar{a}_0^{l_0}\, \bar{a}_1^{l_1} \cdots \bar{a}_s^{l_s}
\]

with the $\bar{a}_i$ of (10). Thus the $\bar{c}_j$ are again polynomials in $1/e$. By continuing this procedure until level $h$ (i.e., performing the refinement step $h$ times) we end up with the partition $A$ and we see that the solution for the corresponding system of equations consists of polynomials in $1/e$, which completes the proof of Proposition 3. □

Note that there is a close link with Galton-Watson branching processes. Let $p_k = 1/(e\,k!)$ denote a Poisson offspring distribution. Now we interpret a class $a_i$ as the class of process realizations for which the (non-planar) branching structure at the beginning of the process corresponds to the root structure of $a_i$. Then $\bar{a}_i = a_i(1/e, 1)$ is just the probability of this event.
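The branching-process interpretation gives a direct way to compute the values $a_i(1/e, 1)$. The sketch below is an illustration under assumptions made here: shapes are encoded as in the partition of Section 3.1, with the hypothetical markers `"sq"` (root out-degree not in $D$) and `"o"` (unconstrained node at level $h$), and the probability of a shape is the Poisson(1) probability of the root out-degree, times the number of distinct orderings of the child shapes, times the probabilities of the children.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import e, factorial


def shapes(depth, D):
    """All truncated shapes of the partition when `depth` levels are inspected."""
    if depth == 0:
        return ["o"]
    below = shapes(depth - 1, D)
    out = ["sq"]
    for d in sorted(D):
        out += [tuple(sorted(kids, key=repr))
                for kids in combinations_with_replacement(below, d)]
    return out


def probability(shape, D):
    """Probability that a Poisson(1) Galton-Watson tree starts with this shape."""
    if shape == "o":
        return 1.0
    if shape == "sq":
        return 1.0 - sum(1 / (e * factorial(d)) for d in D)
    d = len(shape)
    orderings = factorial(d)  # distinct arrangements of the child shapes
    for multiplicity in Counter(shape).values():
        orderings //= factorial(multiplicity)
    result = orderings / (e * factorial(d))
    for child in shape:
        result *= probability(child, D)
    return result


D = {2}
probabilities = {s: probability(s, D) for s in shapes(3, D)}
a4_shape = ("sq", (("o", "o"), ("o", "o")))  # class a_4 of the example
```

For the example ($D = \{2\}$, $h = 3$) the eleven probabilities should sum to 1, and the entry for `a4_shape` should reproduce $a_4(1/e, 1)$.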
We now solve the system of equations obtained for the example pattern. We have $x_0 = 1/e$. The components of $\mathbf{a}_0$ can easily be obtained by following the construction of the proof of Proposition 3 (or we use the branching process interpretation). For example, if we set $p = 1/(2e)$ for the probability of an out-degree 2 and $q = 1 - p$ then we get $\bar{a}_4 = a_4(1/e, 1) = 2qp^4 = (2e-1)/(16e^5)$: the root, one of its children and both children of that child have out-degree 2, while the other child of the root does not. The factor 2 comes from the fact that the two subtrees of the root may be interchanged, see Figure 4. The other classes can be treated similarly and we find:



(11)
$$\begin{aligned}
p(1/e,1) &= 1, & a_5(1/e,1) &= (2e-1)^4/(128e^7),\\
a_0(1/e,1) &= (2e-1)/(2e), & a_6(1/e,1) &= (2e-1)^3/(32e^7),\\
a_1(1/e,1) &= (2e-1)^2/(8e^3), & a_7(1/e,1) &= (2e-1)^2/(64e^7),\\
a_2(1/e,1) &= (2e-1)^3/(16e^5), & a_8(1/e,1) &= (2e-1)^2/(32e^7),\\
a_3(1/e,1) &= (2e-1)^2/(8e^5), & a_9(1/e,1) &= (2e-1)/(32e^7),\\
a_4(1/e,1) &= (2e-1)/(16e^5), & a_{10}(1/e,1) &= 1/(128e^7).
\end{aligned}$$
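The values in (11) can be cross-checked against the branching process interpretation. The following sketch is not part of the original computation. It assumes that every entry is an integer multiple of a monomial $q^i p^j$ with $p = 1/(2e)$ and $q = 1 - p$; the multiplicities and exponents listed were inferred by arithmetic from the printed closed forms (for instance $(2e-1)/(16e^5) = 2qp^4$), and the names `TABLE`, `closed_form` and `max_relative_error` are hypothetical.

```python
import math

E = math.e
P = 1 / (2 * E)   # probability of out-degree 2 under a Poisson(1) offspring law
Q = 1 - P         # probability of any other out-degree


def closed_form(a, c, b):
    """Evaluate (2e-1)^a / (c * e^b), the shape of every entry of (11)."""
    return (2 * E - 1) ** a / (c * E ** b)


# class index -> ((a, c, b) of the printed closed form,
#                 (multiplicity, exponent of q, exponent of p)), the latter inferred
TABLE = {
    0: ((1, 2, 1), (1, 1, 0)),
    1: ((2, 8, 3), (1, 2, 1)),
    2: ((3, 16, 5), (2, 3, 2)),
    3: ((2, 8, 5), (4, 2, 3)),
    4: ((1, 16, 5), (2, 1, 4)),
    5: ((4, 128, 7), (1, 4, 3)),
    6: ((3, 32, 7), (4, 3, 4)),
    7: ((2, 64, 7), (2, 2, 5)),
    8: ((2, 32, 7), (4, 2, 5)),
    9: ((1, 32, 7), (4, 1, 6)),
    10: ((0, 128, 7), (1, 0, 7)),
}


def max_relative_error():
    """Largest relative gap between a printed value and its monomial form."""
    worst = 0.0
    for (a, c, b), (mult, i, j) in TABLE.values():
        printed = closed_form(a, c, b)
        monomial = mult * Q ** i * P ** j
        worst = max(worst, abs(printed - monomial) / printed)
    return worst


print(max_relative_error())
```

The reported gap is at the level of floating-point rounding, which is consistent with reading each $\bar a_i$ as the probability of a finite branching event.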



We are now ready to complete the proof of the main part of Theorem 1. By the propositions above we
can apply Theorem 2, and it follows that the numbers $r_{n,m}$ have a Gaussian limiting distribution
with mean and variance which are proportional to $n$. Since $t_{n,m} = r_{n,m}/n$ we get exactly the same
law for unrooted trees. It remains to compute $\mu$ and $\sigma^2$.

By using the procedure described in Appendix B we get for our example pattern

$$\mu = \frac{5}{8e^3} = 0.0311169177\ldots$$

and

$$\sigma^2 = \frac{20e^3 + 72e^2 + 84e - 175}{32e^6} = 0.0764585401\ldots.$$

We observe, as predicted by Theorem 1, that both $\mu$ and $\sigma^2$ can be written as polynomials
in $1/e$ with rational coefficients.
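As a quick sanity check of the two closed forms for the example pattern, the following sketch evaluates them in floating point and compares them with the printed decimals. The function names are hypothetical and nothing here reproduces the procedure of Appendix B.

```python
import math


def mean_constant():
    """mu = 5 / (8 e^3) for the example pattern."""
    return 5 / (8 * math.e ** 3)


def variance_constant():
    """sigma^2 = (20 e^3 + 72 e^2 + 84 e - 175) / (32 e^6)."""
    e = math.e
    return (20 * e ** 3 + 72 * e ** 2 + 84 * e - 175) / (32 * e ** 6)


print(f"{mean_constant():.10f}")
print(f"{variance_constant():.10f}")
```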

In what follows we will prove this fact (which completes the proof of Theorem 1) and also present
an easy formula for $\mu$. Unfortunately the procedure for calculating $\sigma^2$ is much more complicated,
so that it seems that there is no simple formula.

Proposition 4. Let $x_0 = 1/e$ and $\mathbf{a}_0$ be as in the propositions above and let $P_j(\mathbf{y},u)$ $(1 \le j \le L)$ be
the polynomials of the system of equations, with $\mathbf{y} = (y_0, \ldots, y_L)$. Then $\mu$ (of Theorem 1) is a polynomial
in $1/e$ with rational coefficients and is given by

(12) $$\mu = \frac{1}{e} \sum_{j=1}^{L} \frac{\partial P_j}{\partial u}(\mathbf{a}_0, 1).$$

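Proof. Let $\mathbf{a} = \mathbf{F}(x, \mathbf{a}, u)$ be the system of functional equations from above. In Appendix B
the following formula for the mean is derived:

(13) $$\mu = \frac{1}{x_0}\, \frac{\mathbf{b}^T \mathbf{F}_u(x_0, \mathbf{a}_0, 1)}{\mathbf{b}^T \mathbf{F}_x(x_0, \mathbf{a}_0, 1)}.$$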

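Here $\mathbf{b}^T$ denotes a positive left eigenvector of $\mathbf{I} - \mathbf{F}_{\mathbf{a}}$, which is unique up to scaling.
From the equality

$$\mathbf{F}(x, \mathbf{a}, u) = \begin{pmatrix} x\bigl(e^{a_0 + \cdots + a_L} - \sum_{j=1}^{L} P_j(\mathbf{a}, 1)\bigr) \\ x P_1(\mathbf{a}, u) \\ x P_2(\mathbf{a}, u) \\ \vdots \\ x P_L(\mathbf{a}, u) \end{pmatrix}$$

we get, after denoting $\partial P_j / \partial a_i$ by $P_{j,a_i}$,

(14) $$\mathbf{F}_{\mathbf{a}} = x \begin{pmatrix} e^{a_0 + \cdots + a_L} - \sum_{j=1}^{L} P_{j,a_0} & \cdots & e^{a_0 + \cdots + a_L} - \sum_{j=1}^{L} P_{j,a_L} \\ P_{1,a_0} & \cdots & P_{1,a_L} \\ \vdots & & \vdots \\ P_{L,a_0} & \cdots & P_{L,a_L} \end{pmatrix}.$$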

Since $a_0(x_0, 1) + \cdots + a_L(x_0, 1) = p(x_0, 1) = 1$ we have $x_0 e^{a_0(x_0,1) + \cdots + a_L(x_0,1)} = 1$. Consequently
the sum of all rows of $\mathbf{F}_{\mathbf{a}}$ equals $(1, 1, \ldots, 1)$ for $x = x_0 = 1/e$ and $u = 1$. Thus, denoting the transpose of a
vector $\mathbf{v}$ by $\mathbf{v}^T$, the vector $\mathbf{b}^T = (1, 1, \ldots, 1)$ is the unique positive left eigenvector of $\mathbf{I} - \mathbf{F}_{\mathbf{a}}$, up
to scaling.

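It is now easy to check that

$$x_0\, \mathbf{b}^T \mathbf{F}_x(x_0, \mathbf{a}_0, 1) = \frac{1}{e}\, e = 1$$

and that

$$\mathbf{b}^T \mathbf{F}_u(x_0, \mathbf{a}_0, 1) = \frac{1}{e} \sum_{j=1}^{L} P_{j,u}(\mathbf{a}_0, 1),$$

where $P_{j,u}$ denotes $\partial P_j / \partial u$.

The fact that $\mu$ is a polynomial in $1/e$ is now a direct consequence of the fact that $\mathbf{a}_0$ consists
of polynomials in $1/e$, and the fact that the coefficients are rational follows from the fact that
$\mathbf{F}(x, \mathbf{a}, u)$ has rational coefficients. □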

Of course, with the help of (12) we can easily evaluate $\mu$ directly. As already indicated, it seems
that there is no simple formula for $\sigma^2$.

Before proving Proposition 5 we state an interesting fact that will be used in the sequel.

Lemma 1. Let $a_0, a_1, \ldots, a_L$ be the partition of $p$ that is used in the proof of Theorem 1. Then

$$\det(\mathbf{I} - \mathbf{F}_{\mathbf{a}}(x, \mathbf{a}, 1)) = 1 - x e^{a_0 + a_1 + \cdots + a_L}.$$

Since the proof is a rather lengthy computation we postpone it to Appendix C.

Proposition 5. Let $x_0 = 1/e$ and $\mathbf{a}_0$ be as in the propositions above. Then $\sigma^2$ (of Theorem 1) is a
polynomial in $1/e$ (with rational coefficients).

Proof. From the proof of Proposition 4 we already know that $x_u(1)$ can be represented as a
polynomial in $1/e$ (with rational coefficients). The next step is to show that $\mathbf{a}_u(1)$ has the same
property. For this purpose we have to look at the system (30)

$$(\mathbf{I} - \mathbf{F}_{\mathbf{a}})\, \mathbf{a}_u = \mathbf{F}_x x_u + \mathbf{F}_u,$$
$$-D_{\mathbf{a}}\, \mathbf{a}_u = D_x x_u + D_u,$$

where $D(x, \mathbf{a}, u) = \det(\mathbf{I} - \mathbf{F}_{\mathbf{a}}(x, \mathbf{a}, 1)) = 1 - x e^{a_0 + a_1 + \cdots + a_L}$. We first observe that

$$D_{\mathbf{a}}(x_0, \mathbf{a}_0, 1) = (-1, -1, \ldots, -1).$$

Hence, we can replace the first row of the $(L+1) \times (L+1)$-matrix $\mathbf{I} - \mathbf{F}_{\mathbf{a}}$ (that is redundant
since the matrix has rank $L$) by the row $(1, 1, \ldots, 1)$ and obtain a regular linear system for $\mathbf{a}_u(1)$.
Note that all entries of the right-hand side of this linear system can be represented as polynomials
in $1/e$.

Let $\mathbf{M}(x, \mathbf{a})$ denote the matrix obtained from $\mathbf{I} - \mathbf{F}_{\mathbf{a}}(x, \mathbf{a}, 1)$ by replacing the first row by
$(1, 1, \ldots, 1)$. It follows from the proof of Lemma 1 that $\det \mathbf{M}(x, \mathbf{a}) = 1$. Further, all entries of
$\mathbf{M}(x_0, \mathbf{a}_0)$ can be represented as polynomials in $1/e$. Thus, $\mathbf{M}(x_0, \mathbf{a}_0)^{-1}$ has the same property
and consequently $\mathbf{a}_u(1)$ has this property, too.

From that it directly follows from (31) that $x_{uu}$ is also represented as a polynomial in $1/e$. (By
definition, $D(x, \mathbf{a}, u)$ is a polynomial with rational coefficients in the entries of $\mathbf{I} - \mathbf{F}_{\mathbf{a}}$.)

With the help of the formula for $\sigma^2$ this finally leads to a representation of $\sigma^2$ as a polynomial in $1/e$. □

This finally completes the proof of Theorem 1.
5. Extensions and Generalizations



In what follows we list some obvious and some less obvious extensions of our main result. For 
the sake of conciseness we do not present the details. 

5.1. Several Patterns. Let $\mathcal{M}_1, \ldots, \mathcal{M}_k$ be $k$ different patterns. Then the problem is to
determine the joint (limiting) distribution of the number of occurrences of $\mathcal{M}_1, \ldots, \mathcal{M}_k$ in trees of
size $n$. Using the same techniques as above (introducing the forest of planted patterns deduced
from the patterns) we again obtain a system of functional equations. The only difference is that
we now have to count occurrences of $\mathcal{M}_1, \ldots, \mathcal{M}_k$ with different variables $u_1, \ldots, u_k$, which is
done in the same fashion as for a single $u$. In view of Theorem 2, multiple variables $u$ make no
difference and we obtain a multivariate Gaussian limiting distribution.

5.2. Patterns Containing Paths of Unspecified Length. It might also be interesting to 
consider patterns where specific edges can be replaced by paths of arbitrary length. It turns out 
that this case in particular is more involved since a natural partition of all planted rooted trees is 
now infinite. Nevertheless it is possible to replace infinite series of such classes by one new class 
and end up with a finite system. Thus, this leads to a Gaussian limit law (as above). 

5.3. Filled and Empty Nodes. In our model we have distinguished between internal (filled)
and external (empty) nodes of the pattern $\mathcal{M}$, where the degrees of the internal (filled) nodes have
to match exactly. It also seems to be possible to consider the following more general matching
problem: Let $\mathcal{M}$ again be a finite tree, where certain nodes are "filled" and the remaining ones
are "empty". Now we say that $\mathcal{M}$ matches if it occurs as a subtree such that the corresponding
degrees of the filled nodes are equal whereas the degrees of the empty nodes might be different.
It seems that the counting procedure above can be adapted to cover this case, too. However, it
is definitely more involved. For example, if leaves of the pattern are filled nodes then these nodes
have to be leaves wherever the pattern occurs. This implies that some of the functions $a_j(x, u)$ are
then explicitly given in the system and the dependency graph is not strongly connected. However,
it seems that this situation can be managed by eliminating these functions. Furthermore, and
this is more serious, in general one has to consider infinitely many classes of trees leading to an
infinite system of functional equations, in particular if an internal node is "empty". In such a case
Theorem 2 cannot be applied any more. Nevertheless we hope that the approach of Lalley [Lal],
which is applicable to infinite systems of functional equations in one variable, can be extended to
yield a generalization of Theorem 2 to proper infinite systems. Thus, we can expect a
Gaussian limit law even in this case.

In order to be more precise we will present an easy example. Let $\mathcal{M}$ denote the pattern depicted
in Figure 7. Here all nodes are empty. Thus, the corresponding pattern counting problem is a
subgraph counting problem.

We partition all planted trees according to their root degree. Let $a_k$ denote the set of planted
rooted trees with root out-degree $k$ and $a_k(x, u)$ the corresponding generating function (that also
counts the number of subgraph occurrences of $\mathcal{M}$). Further, let $r(x, u)$ denote the generating
function of rooted trees. Then we have







Figure 7. Example pattern with empty nodes 




(k > 0) 
and

r(x, u) - 

This system is easy to solve for $u = 1$. Here we have $a_k(x, 1) = x\,p(x)^k / k!$ and $r(x, 1) = p(x)$. By
taking derivatives with respect to $u$ and summing over all $k$ we also get (after some algebra)

$$r_u(x, 1) = \frac{5}{12}\, \frac{p(x)^7}{1 - p(x)} + \frac{1}{6}\, \frac{p(x)^8}{1 - p(x)} + \frac{p(x)^7}{6}.$$

This implies that the average number of pattern occurrences (in this sense) is of the form $(7/12)n +
O(1)$, that is, $\mu = 7/12$. In principle it is also possible to get asymptotics for higher moments but
the calculations get more and more involved.
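The passage from the terms with denominator $1 - p(x)$ to a mean of order $n$ rests on the identity $[x^n]\, 1/(1 - p(x)) = n\, p_n$ for $n \ge 1$, which follows from $p(x) = x e^{p(x)}$ (so that $x p'(x) = p(x)/(1 - p(x))$). The sketch below checks this identity with exact rational arithmetic. It assumes the labeled case, $p_n = n^{n-1}/n!$, and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def tree_coefficients(n_max):
    """Coefficients p_0..p_{n_max} of p(x), with p_n = n^(n-1)/n! and p_0 = 0."""
    return [Fraction(0)] + [
        Fraction(n ** (n - 1), factorial(n)) for n in range(1, n_max + 1)
    ]


def inverse_one_minus(p, n_max):
    """Coefficients g_0..g_{n_max} of g = 1/(1 - p), computed from g = 1 + p*g."""
    g = [Fraction(1)] + [Fraction(0)] * n_max
    for n in range(1, n_max + 1):
        g[n] = sum(p[k] * g[n - k] for k in range(1, n + 1))
    return g


p = tree_coefficients(8)
g = inverse_one_minus(p, 8)
print(all(g[n] == n * p[n] for n in range(1, 9)))
```

Since $[x^n]\, 1/(1-p(x))$ exceeds $p_n$ by exactly a factor $n$, the total weight of the singular terms divided by $p_n$ gives the linear growth of the mean.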

5.4. Simply Generated Trees. Simply generated trees have been introduced by Meir and Moon
[MM78] and are proper generalizations of several types of rooted trees. Let

$$\varphi(x) = \varphi_0 + \varphi_1 x + \varphi_2 x^2 + \cdots$$

be a power series with non-negative coefficients; in particular we assume that $\varphi_0 > 0$ and $\varphi_j > 0$
for some $j \ge 2$. We then define the weight $\omega(T)$ of a finite rooted tree $T$ by

$$\omega(T) = \prod_{j \ge 0} \varphi_j^{D_j(T)},$$

where $D_j(T)$ denotes the number of nodes in $T$ with $j$ successors. If we set

$$y_n = \sum_{|T| = n} \omega(T)$$

then the generating function

$$y(x) = \sum_{n \ge 1} y_n x^n$$

satisfies the functional equation

$$y(x) = x \varphi(y(x)).$$

In this context, $y_n$ denotes a weighted number of trees of size $n$. For example, if $\varphi_j = 1$ for all
$j \ge 0$ (that is, $\varphi(x) = 1/(1 - x)$) then all rooted trees have weight $\omega(T) = 1$ and $y_n = p_n$ is the
number of planted plane trees. If $\varphi_j = 1/j!$ (that is, $\varphi(x) = e^x$) then we formally get labeled
rooted trees, etc.
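The functional equation $y(x) = x\varphi(y(x))$ determines the numbers $y_n$ recursively, which can be illustrated by fixed-point iteration on truncated power series. The following is a minimal sketch with exact rational arithmetic; the helper names `truncated_product` and `solve_simply_generated` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def truncated_product(a, b, n_max):
    """Product of two power series (coefficient lists), truncated after x^n_max."""
    c = [Fraction(0)] * (n_max + 1)
    for i, a_i in enumerate(a):
        if a_i:
            for j, b_j in enumerate(b):
                if i + j > n_max:
                    break
                c[i + j] += a_i * b_j
    return c


def solve_simply_generated(phi, n_max):
    """Return y_0..y_{n_max} with y = x * phi(y); phi lists phi_0..phi_{n_max}."""
    y = [Fraction(0)] * (n_max + 1)
    for _ in range(n_max):  # every pass fixes at least one more coefficient
        acc = [Fraction(0)] * (n_max + 1)
        for coefficient in reversed(phi):  # Horner evaluation of phi(y)
            acc = truncated_product(acc, y, n_max)
            acc[0] += coefficient
        y = [Fraction(0)] + acc[:n_max]  # multiply by x
    return y


plane = solve_simply_generated([Fraction(1)] * 6, 5)
labeled = solve_simply_generated([Fraction(1, factorial(j)) for j in range(6)], 5)
print(plane[1:])    # weights for phi(x) = 1/(1 - x)
print(labeled[1:])  # weights for phi(x) = e^x
```

For $\varphi(x) = 1/(1-x)$ the coefficients are the Catalan numbers $1, 1, 2, 5, 14$, and for $\varphi(x) = e^x$ they equal $n^{n-1}/n!$, in line with the two examples in the text.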

Of course, we can proceed in the same way as above and obtain a system of functional equations
that counts occurrences of a specific pattern in simply generated trees, and (under suitable conditions
on the growth of $\varphi_j$) we finally obtain a Gaussian limiting distribution. This has explicitly
been done by Kok in his thesis [Kok05a, Kok05b].

5.5. Unlabeled Trees. Let $p_n$ denote the number of unlabeled planted rooted trees and $t_n$ the
number of unlabeled unrooted trees. The generating functions are denoted by

$$p(x) = \sum_{n \ge 1} p_n x^n \quad\text{and}\quad t(x) = \sum_{n \ge 1} t_n x^n.$$

The structure of these trees is much more difficult than that of labeled trees. It turns out that
one has to apply Polya's theory of counting and an amazing observation (15) by Otter [Ott48].
The generating functions $p(x)$ and $t(x)$ satisfy the functional equations

$$p(x) = x \sum_{k \ge 0} Z(S_k; p(x), p(x^2), \ldots, p(x^k)) = x \exp\Bigl(p(x) + \tfrac{1}{2} p(x^2) + \tfrac{1}{3} p(x^3) + \cdots\Bigr)$$

and

(15) $$t(x) = p(x) - \tfrac{1}{2} p(x)^2 + \tfrac{1}{2} p(x^2),$$

where $Z(S_k; x_1, \ldots, x_k)$ denotes the cycle index of the symmetric group $S_k$. These functions have
a common radius of convergence $\rho \approx 0.338219$ and a local expansion of the form

$$p(x) = 1 - b(\rho - x)^{1/2} + c(\rho - x) + d(\rho - x)^{3/2} + O((\rho - x)^2)$$

and

$$t(x) = \frac{1 + p(\rho^2)}{2} - \frac{b^2 + 2\rho\, p'(\rho^2)}{2}\,(\rho - x) + bc\,(\rho - x)^{3/2} + O((\rho - x)^2),$$

where $b \approx 2.6811266$ and $c = b^2/3 \approx 2.3961466$, and $x = \rho$ is the only singularity on the circle of
convergence $|x| = \rho$. Thus, they behave similarly to the generating functions $p(x)$ and $t(x)$ of the labeled case. We also get

$$p_n = \frac{b \sqrt{\rho}}{2 \sqrt{\pi}}\, n^{-3/2} \rho^{-n} \bigl(1 + O(n^{-1})\bigr)$$

and

$$t_n = \frac{b^3 \rho^{3/2}}{4 \sqrt{\pi}}\, n^{-5/2} \rho^{-n} \bigl(1 + O(n^{-1})\bigr).$$

Furthermore, it is possible to count the number of nodes of specific degree with the help of
bivariate generating functions (compare with [DG99]). Thus, using Polya's theory of counting we
can also obtain a system of functional equations for bivariate generating functions that count the
number of occurrences of a specific pattern. The major difference from the procedure above is that
this system also contains terms of the form $a_j(x^k, u^k)$ for $k \ge 2$. Fortunately these terms can
be considered as known functions when $x$ varies around the singularity $\rho$ and $u$ varies around 1
(compare again with [DG99]). Hence, Theorem 2 applies again and we can proceed as above. This
has explicitly been done by Kok in his thesis [Kok05a, Kok05b].
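The two functional equations for unlabeled trees translate directly into integer recurrences. The sketch below is an illustration only: it uses the standard recurrence obtained by logarithmic differentiation of $p(x) = x \exp(\sum_k p(x^k)/k)$ and then applies Otter's relation (15) coefficientwise. The function names are hypothetical.

```python
def rooted_counts(n_max):
    """p[n] = number of unlabeled rooted trees with n nodes, for 1 <= n <= n_max."""
    p = [0] * (n_max + 1)
    p[1] = 1
    for n in range(2, n_max + 1):
        total = 0
        for k in range(1, n):
            # sum of d * p_d over all divisors d of k
            divisor_sum = sum(d * p[d] for d in range(1, k + 1) if k % d == 0)
            total += divisor_sum * p[n - k]
        p[n] = total // (n - 1)  # the division is exact
    return p


def unrooted_counts(p):
    """t[n] from t(x) = p(x) - p(x)^2/2 + p(x^2)/2, read off coefficientwise."""
    n_max = len(p) - 1
    t = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        pairs = sum(p[i] * p[n - i] for i in range(1, n))
        half = p[n // 2] if n % 2 == 0 else 0
        t[n] = (2 * p[n] - pairs + half) // 2
    return t


p = rooted_counts(8)
t = unrooted_counts(p)
print(p[1:])
print(t[1:])
```

The first values are $1, 1, 2, 4, 9, 20, 48, 115$ for rooted trees and $1, 1, 1, 2, 3, 6, 11, 23$ for unrooted trees; the ratios $p_n / p_{n+1}$ approach $\rho$ slowly as $n$ grows.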

5.6. Forests. First, let us consider the case of labeled trees with generating function $t(x, u)$. Then
the generating function $f(x, u)$ of labeled forests is given by

$$f(x, u) = e^{t(x, u)}.$$

Thus, the singular behaviour of $f(x, u)$ is the same as that of $t(x, u)$ (compare with [DG99]) and
consequently we again obtain a Gaussian limiting distribution for the number of occurrences of a
specific pattern in labeled forests.

The case of unlabeled forests is similar. Here we have

$$f(x, u) = \exp\Bigl(t(x, u) + \tfrac{1}{2} t(x^2, u^2) + \tfrac{1}{3} t(x^3, u^3) + \cdots\Bigr).$$

Of course, we can consider other classes of trees or forests of a given number of trees.
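For $u = 1$ the exponential relation between labeled trees and labeled forests can be checked on the level of counts. The sketch below assumes Cayley's formula $n^{n-2}$ for the number of unrooted labeled trees and uses the usual recurrence for the coefficients of $e^{t(x)}$ obtained by singling out the tree that contains the node with label 1. The function names are hypothetical.

```python
from math import comb


def labeled_trees(n):
    """Cayley's formula for the number of unrooted labeled trees on n nodes."""
    return 1 if n <= 2 else n ** (n - 2)


def labeled_forests(n_max):
    """f[n] = n! [x^n] exp(t(x)), the number of labeled forests on n nodes."""
    f = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        # choose the k - 1 companions of node 1 inside its tree
        f[n] = sum(
            comb(n - 1, k - 1) * labeled_trees(k) * f[n - k]
            for k in range(1, n + 1)
        )
    return f


print(labeled_forests(5))
```

The output starts $1, 1, 2, 7, 38, 291$.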

5.7. Forbidden Patterns. It is also interesting to count the number $t_{n,0}$ of trees of size $n$ without
a given pattern. The generating function of these numbers is just $p(x, 0)$, resp. $t(x, 0)$. It is now
an easy exercise to show that there exists an $\eta > 0$ such that

$$t_{n,0} = O\bigl(t_n e^{-\eta n}\bigr).$$

The only thing we have to check is that the radius of convergence of $t(x, 0)$ is larger than the
radius of convergence of $t(x, 1)$. However, this is obvious since the radius of convergence of $t(x, u)$
(which is the same as that of $p(x, u)$) is given by $x(u)$ (for $u$ around 1) and $x'(1) < 0$.

Appendix A. Algorithms 

In the main part of this paper we showed that the limiting distribution of the number of pattern
occurrences is normal with computable $\mu$ and $\sigma^2$. However, the family of classes $\{a_0, a_1, \ldots, a_L\}$
considered in the first part was created specifically to make the arguments more transparent;
minimality was not a concern. In this appendix we focus on creating another partition
$A = \{a_0, \ldots, a_L\}$ of $p$ which has considerably fewer classes. It also has the properties that it is
recursively describable and allows an unambiguous definition of the number of additional occurrences
$K(l_0, \ldots, l_L)$ of the pattern. For example we show that for the pattern of Figure 9 we need
just 8 equations whereas the previous proof would use more than 1000 equations.
First we remark that in some cases it is profitable to adjust the structure of the system of
equations of the main part by allowing an additional polynomial $P_0(y_0, \ldots, y_L, u)$ in the first
equation. The first equation then becomes

$$a_0(x, u) = x\, P_0(a_0(x, u), \ldots, a_L(x, u), u) + x \Bigl(e^{a_0(x,u) + \cdots + a_L(x,u)} - \sum_{j=0}^{L} P_j(a_0(x, u), \ldots, a_L(x, u), 1)\Bigr).$$

This system still fits our analytical framework. The advantage is that, for example, the minimal
system of equations for counting stars in trees on page 3 now fits this modified system.

The idea for constructing $A$ is first to create a certain family of tree classes
$S = \{t_1, \ldots, t_n\}$, not necessarily forming a partition of $p$. Each of these classes will be defined
as the class of all trees in $p$ which "start" in a certain way, or in other words, which match a
certain tree at the root, just as was the case for the $a_j$ in the main part of this paper. By abuse
of notation we will usually write $t_i$ for this tree as well. Let $J = \{1, \ldots, n\}$ and $\bar t_i = p \setminus t_i$.
Now, by collecting in $A$ all different, non-empty classes of the form

(16) $$a_I = \bigcap_{i \in I} t_i \cap \bigcap_{i \in J \setminus I} \bar t_i, \qquad I \subseteq J,$$

we will obtain a partition $A$ of $p$. This partition will have a recursive description by construction,
see the algorithms below. Furthermore, if $S$ is sufficiently rich, this partition will allow an
unambiguous definition of $K(l_0, \ldots, l_L)$.

We now make some considerations about the properties that $S$ should possess to make sure
that $A$ will allow an unambiguous definition of $K(l_0, \ldots, l_L)$. Let $b$ be a subclass of $p$. For each
tree $T \in p$ we can determine the number $k(T)$ of pattern occurrences at the root of $T$. Let
$k(b) = \{k(T) : T \in b\}$. Because the patterns have finitely many nodes and because in each
internal node the degree is fixed and the root has to be part of the match, there are only finitely
many ways for a pattern match. Thus the set $k(b)$ will be finite and non-empty. Now let $a_I$ be defined
by equation (16) (and non-empty). Then it holds that

(17) $$k(a_I) \subseteq \bigcap_{i \in I} k(t_i) \cap \bigcap_{i \in J \setminus I} k(\bar t_i)$$

because a tree $T$ in $a_I$ is by definition in $t_i$, $i \in I$, and in $\bar t_i$, $i \in J \setminus I$; thus the number of pattern
occurrences at the root is constrained by $k(t_i)$, $i \in I$, and $k(\bar t_i)$, $i \in J \setminus I$. If $S = \{t_1, \ldots, t_n\}$ is
sufficiently rich, then $k(a_I)$ will only consist of a single number. This will be the case if for each
$m \in \mathbb{N}$, the family $S$ contains all classes of trees "starting" with all possible arrangements of $m$
overlapping patterns. Indeed, if we have for example for a certain tree class $t_i$ that $k(t_i) = \{r, r+1\}$,
then there will be another tree class $t_j$, which is a subclass of $t_i$ with $k(t_j) = \{r+1\}$. Now the
intersections $b = t_i \cap \bar t_j$ and $c = t_i \cap t_j$ will yield tree classes with a singleton $k(\cdot)$, namely $k(b) = \{r\}$
and $k(c) = \{r+1\}$.

For example consider a pattern which consists of a node of degree 2 attached to a node of
degree 3. The corresponding planted patterns are shown in Figure 8. Now let $S$ consist of the
three classes $t_1, t_2, t_3$ shown in the center of Figure 8. We have $k(t_1) = \{1\}$, because the left
planted pattern surely matches and the other does not; $k(t_2) = \{1, 2\}$, because the left planted
pattern does not match and the right one matches at least once, but possibly twice; and $k(t_3) = \{2\}$,
because the left pattern does not match and the right one surely matches twice. We see that
the only non-empty intersections of the form (16) are $a = t_1 \cap \bar t_2 \cap \bar t_3$, $b = \bar t_1 \cap t_2 \cap \bar t_3$ and
$c = \bar t_1 \cap t_2 \cap t_3$. We obtain $k(a) = k(b) = \{1\}$ and $k(c) = \{2\}$, which are all singletons. Because
we also need a recursive description of the final partition $A$, we will construct some additional
tree classes $t_i$. As the partition becomes finer when dealing with more classes $t_i$, it is clear that $k$
remains well-defined.

On the other hand we do not have to associate a unique number to $k(a_j)$, only to $K(l_0, \ldots, l_L)$.
Therefore we can slightly reduce the family $S = \{t_1, \ldots, t_n\}$. In the algorithm below this reduction
of $S$ corresponds to considering only proper subtrees of the trees $q \in Q$ ($q$ itself is excluded).
Figure 8. On the left: Planted patterns. Center: Classes $t_i$. Right: Classes
$\{a, b, c\}$. The white box here means a node of out-degree different from 1. Note:
this does not correspond to the output of the algorithms of this appendix

A coarse-grain description of an algorithm now follows. 

(1) Calculate the set $U$ of all planar embeddings of all planted patterns deducible from the
pattern $\mathcal{M}$.

(2) Consider the planted planar trees resulting from step 1 as planar tree classes and take all possible
intersections of any number of those classes. Now take the implied non-planar general tree
structure of each class and collect these non-planar planted trees in the set $Q$.

(3) Create a family $S = \{t_1, \ldots, t_n\}$ for the forest of planted subtrees of trees $q \in Q$, excluding
the trees $q$ themselves, where each $t_i$ has a recursive description in terms of $t_0, t_1, \ldots, t_{i-1}$ and where
$t_0$ denotes a leaf.

(4) Now interpret $t_0$ as the class of all trees $p$ and interpret the trees $t_i \in S$ as non-planar tree
classes. Construct a partition $A = \{a_0, \ldots, a_L\}$ of the class of all planted trees $p$ together
with a recursive description (compare with (16)).

(5) Calculate for each term in the recursive description the number $K(l_0, \ldots, l_L)$ of additional
pattern occurrences and deduce a system of equations for the generating functions $a_j(x, u)$
of the classes $a_j$.

Before giving more detailed algorithms, we give an example. Consider the pattern of Figure 9.

Figure 9. Example pattern $\mathcal{M}$

With the procedure of the main part of the article we would end up with more than 1000
classes, yielding a system of equations with the same number of equations. However, by using the
following refined algorithm we only need 8 classes.

In the first step we create all planar embeddings of the corresponding planted patterns (trees
$\tau_1, \tau_2, \tau_3$ of Figure 14). This yields $3 \cdot 2 + 2 + 4 \cdot 2 = 16$ planar trees, some of which are shown in
Figure 10.

Figure 10. Some of the 16 planted planar embeddings in $U$
We now consider these structures as planar tree classes and additionally construct tree classes
by taking all possible intersections of any number of the classes obtained in step 1. Then, we take
the implied non-planar tree structure of each planar class and collect these trees in $Q$. We end up
with 24 different trees: 9 that stem from $\tau_1$, 1 from $\tau_2$, and 14 from $\tau_3$. Some of them are shown
in Figure 11.

Figure 11. Some of the 24 non-planar trees of $Q$



For all proper subtrees of each tree in $Q$ we now construct a recursive description. For example,
for the leftmost tree of Figure 11 we first consider the subtree consisting of a node with four leaves.
We denote this class by $t_4 = x t_0^4$. (Here we use the following structural notation: $x$ denotes a
root node, $t_0$ a leaf, and $x t_0^4$ denotes a root to which 4 leaves are attached.) The next subtree is
a root of out-degree 2 to which a subtree of type $t_4$ is attached. We denote this by $t_5 = x t_0 t_4$.
Figure 12 shows all 6 trees we end up with. Observe in our example that the collection of subtrees
extracted from the 24 trees in $Q$ consists of only 6 trees.



Figure 12. Non-planar trees $t_i$ which possess a recursive description



Their recursive descriptions are given by

(18) $$t_1 = x t_0^3, \quad t_2 = x t_0 t_1, \quad t_3 = x t_1^2, \quad t_4 = x t_0^4, \quad t_5 = x t_0 t_4, \quad t_6 = x t_4^2.$$

We now interpret $t_0$ in (18) as the class of all planted trees $p$. The other $t_i$ are also interpreted
as tree classes. For example, $t_1$ is the class of all trees with root out-degree 3. We now construct
a partition based on these classes and their recursive description (18). We obtain the classes of
Figure 13.
Figure 13. Non-planar partition classes. The white box means "not out-degree
3 or 4" and the white triangle means "anything that is not contained in the other
classes"



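Their recursive description is given by

(19)
$$\begin{aligned}
a_0 &= p \setminus \bigoplus_{i=1}^{7} a_i = x \oplus x \bigoplus_{i=0}^{7} a_i \oplus x (a_0 \oplus a_2 \oplus a_3 \oplus a_5 \oplus a_6 \oplus a_7)^2 \oplus x \bigoplus_{n=5}^{\infty} \Bigl(\bigoplus_{i=0}^{7} a_i\Bigr)^{n},\\
a_1 &= x p^3,\\
a_2 &= x a_1^2,\\
a_3 &= x a_1 a_4,\\
a_4 &= x p^4,\\
a_5 &= x (a_0 \oplus a_2 \oplus a_3 \oplus a_5 \oplus a_6 \oplus a_7)\, a_1,\\
a_6 &= x a_4^2,\\
a_7 &= x (a_0 \oplus a_2 \oplus a_3 \oplus a_5 \oplus a_6 \oplus a_7)\, a_4.
\end{aligned}$$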

The last step consists of determining the number of additional occurrences $K(l_0, \ldots, l_7)$ for
each term in the recursive description (19) and translating (19) into a system of equations for the
generating functions $a_j(x, u) = a_j$. As an example we consider the equation for $a_1$. Class $a_1$
consists of the trees of root out-degree 3. We get no additional occurrences of the pattern if we
attach a tree of class $a_0$, $a_1$, $a_2$, $a_4$ or $a_5$ to such a root, we get one additional occurrence for each
tree of class $a_3$ or $a_7$, and we have two additional occurrences for each tree of class $a_6$ attached to
the root. This yields the equation for $a_1(x, u)$ below. Altogether we obtain:

$$\begin{aligned}
a_0 &= x + x \sum_{i=0}^{7} a_i + \frac{1}{2!}\, x (a_0 + a_2 + a_3 + a_5 + a_6 + a_7)^2 + x \sum_{n \ge 5} \frac{1}{n!} \Bigl(\sum_{i=0}^{7} a_i\Bigr)^{n},\\
a_1 &= \frac{1}{3!}\, x \bigl(a_0 + a_1 + a_2 + a_4 + a_5 + (a_3 + a_7) u + a_6 u^2\bigr)^3,\\
a_2 &= \frac{1}{2!}\, x a_1^2,\\
a_3 &= x a_1 a_4 u,\\
a_4 &= \frac{1}{4!}\, x \bigl(a_0 + a_1 + a_4 + a_6 + a_7 + (a_3 + a_5) u + a_2 u^2\bigr)^4,\\
a_5 &= x (a_0 + a_2 + a_3 + a_5 + a_6 + a_7)\, a_1,\\
a_6 &= \frac{1}{2!}\, x a_4^2,\\
a_7 &= x (a_0 + a_2 + a_3 + a_5 + a_6 + a_7)\, a_4.
\end{aligned}$$



We can now calculate \x. We get [i = 
feasible, because of memory problems. 5 



256-43e 



0.865759040 The computation of a 2 was not 



The actual computation uses polynomial expressions with more than 200,000 terms. We used Maple 9.5, which 
used up the memory of 1 GB and a very large part of the 1 GB swap. 
A.1. Planar embedding algorithm: GeneralToPlanar.

Input: a general planted tree $\tau$

Output: the set $U$ of planted planar trees $\pi$ that share $\tau$ as their implied general tree structure

Algorithm:

(1) write $\tau$ in the form $x \tau_1 \cdots \tau_k$, that is, let $k$ be the root out-degree of $\tau$ and $\tau_1, \ldots, \tau_k$ be
the children at the root

(2) for each $i$ between 1 and $k$, recursively compute $P_i = \mathrm{GeneralToPlanar}(\tau_i)$

(3) construct and return the set of planar trees $x \pi_{\sigma(1)} \cdots \pi_{\sigma(k)}$ over all choices of $\pi_i \in P_i$ and
over all permutations $\sigma$ of $\{1, \ldots, k\}$
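The recursion can be sketched in a few lines. The encoding below is an assumption of this sketch, not the authors' implementation: a tree is a nested tuple of its children and a leaf is the empty tuple. The shapes `TAU1`, `TAU2`, `TAU3` of the planted example patterns are likewise an assumption, inferred from the class descriptions in (18) and from the embedding counts quoted in the text.

```python
from itertools import permutations, product

LEAF = ()


def general_to_planar(tree):
    """Return the set of planar embeddings of a non-planar tree.

    A planar tree uses the same nested-tuple encoding, but the order of
    the children is significant.
    """
    if tree == LEAF:
        return {LEAF}
    child_sets = [general_to_planar(child) for child in tree]
    embeddings = set()
    for choice in product(*child_sets):      # one embedding per child
        for ordering in permutations(choice):  # every order of the children
            embeddings.add(ordering)         # the set removes duplicates
    return embeddings


# Assumed shapes of the planted example patterns (hypothetical reconstruction).
D3 = (LEAF,) * 3
D4 = (LEAF,) * 4
TAU1 = (LEAF, LEAF, (LEAF, D4))
TAU2 = (D3, D4)
TAU3 = (LEAF, LEAF, LEAF, (LEAF, D3))

print([len(general_to_planar(t)) for t in (TAU1, TAU2, TAU3)])
```

Under these assumptions the three patterns have 6, 2 and 8 planar embeddings, which agrees with the total of 16 in the example.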

A.2. Tree class intersection algorithm.

Input: a set of planted planar trees $U$

Output: the set $Q$ of non-planar planted trees which are obtained by intersecting planar tree classes
based on $U$ and collecting the non-planar tree structures of the resulting planar tree classes.

Algorithm:

(1) For each $i$ between 1 and $|U|$, consider all $i$-tuples of different trees $\pi_1, \ldots, \pi_i \in U$ and
determine for each $i$-tuple if $s = \pi_1 \cap \cdots \cap \pi_i$ may be interpreted as a non-empty tree class.
In that case, let $s'$ be the implied non-planar tree structure of $s$ and add $s'$ to the set $Q$.
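One way to realize the intersection step is to overlay two planar trees node by node. The sketch below makes several assumptions that are not stated in the text: trees are nested tuples with the empty tuple as a leaf, a leaf of a pattern stands for an arbitrary subtree, and two internal nodes are compatible only if their out-degrees agree. Instead of enumerating all tuples it closes the set under pairwise intersection, which yields the same classes. The pattern shapes are the same hypothetical reconstruction as before.

```python
from itertools import permutations, product

LEAF = ()


def general_to_planar(tree):
    """All planar embeddings of a non-planar tree (nested tuples)."""
    if tree == LEAF:
        return {LEAF}
    child_sets = [general_to_planar(c) for c in tree]
    return {o for choice in product(*child_sets) for o in permutations(choice)}


def intersect(a, b):
    """Overlay two planar tree classes; None encodes the empty class."""
    if a == LEAF:
        return b
    if b == LEAF:
        return a
    if len(a) != len(b):
        return None
    children = []
    for child_a, child_b in zip(a, b):
        child = intersect(child_a, child_b)
        if child is None:
            return None
        children.append(child)
    return tuple(children)


def canonical(tree):
    """Implied non-planar structure: sort the children recursively."""
    return tuple(sorted(canonical(c) for c in tree))


def intersection_closure(planar_trees):
    """Non-planar structures of all non-empty intersections of the classes."""
    classes = set(planar_trees)
    frontier = set(planar_trees)
    while frontier:
        new = set()
        for s in frontier:
            for u in planar_trees:
                c = intersect(s, u)
                if c is not None and c not in classes:
                    new.add(c)
        classes |= new
        frontier = new
    return {canonical(c) for c in classes}


D3, D4 = (LEAF,) * 3, (LEAF,) * 4          # assumed pattern shapes
TAU1, TAU2, TAU3 = (LEAF, LEAF, (LEAF, D4)), (D3, D4), (LEAF, LEAF, LEAF, (LEAF, D3))
U = set().union(*(general_to_planar(t) for t in (TAU1, TAU2, TAU3)))
Q = intersection_closure(U)
print(len(U), len(Q))
```

With these assumed shapes the sketch reproduces the counts of the example, 16 planar embeddings and 24 non-planar trees in $Q$.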

A.3. DAGification algorithm.

We construct a recursive description for the forest of planted subtrees of each tree in a given set
of planted trees. Here we do not consider the tree itself as a subtree of itself. This calculation is
reminiscent of the DAGification process of computer science (see, e.g., [ASU86]), which aims at
compacting an expression tree by sharing repeated subexpressions. However, if we interpret those
subtrees as classes, the intersection of two classes need not be empty.

Input: set of planted trees $Q$

Output: a number $m$ and a recursive description of the forest of planted subtrees $S = \{t_1, \ldots, t_m\}$
of the trees of $Q$, of the form

$$t_i = x\, t_{\lambda^{(i)}_1} \cdots t_{\lambda^{(i)}_{r_i}} \quad (r_i \in \mathbb{N}) \quad \text{for } 1 \le i \le m$$

with the constraint $\lambda^{(i)}_j < i$ for all $i$ and $j$

Algorithm:

(Initialization) Introduce the exceptional type $t_0$ to denote the planted tree consisting of a single
node (in other words, a leaf) and set $m$ to 0

(Main loop) For all planted trees of $Q$ perform a depth-first traversal of the tree, starting from
the planted root; during this recursive calculation, at each node $n$:

(1) if the node is a leaf, return the type $t_0$

(2) else, recursively determine the type associated with each child of $n$

(3) if $n$ is not the planted root of the tree, write the subtree rooted at $n$ as a (commutative)
product $\pi = x\, t_{\lambda_1} \cdots t_{\lambda_r}$ of the types obtained in the previous step

(4) look up the uniquification table to check whether this product has already been
assigned a type $t_i$

(5) if not existent, increment $m$, create a new type $t_m$, remember its definition $t_m = \pi$,
and assign $t_m$ to the product $\pi$ in the uniquification table

(6) return the type $t_i$ if it was found by lookup, otherwise return $t_m$

(Conclusion) Return $m$ and the sequence of definitions of the form $t_i = \pi_i$, for $i = 1, 2, \ldots, m$.
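The uniquification table is an instance of hash-consing and can be sketched with a dictionary. In the sketch below, which is an illustration under stated assumptions rather than the authors' code, trees are nested tuples with the empty tuple as a leaf, a commutative product is stored as a sorted tuple of type indices, and the counter starts at 0 so that the first new type is $t_1$. The function name `dagify` and the small input `Q_SMALL` are hypothetical.

```python
LEAF = ()


def dagify(trees):
    """Assign types to all proper non-leaf subtrees of the given trees.

    Returns a dict mapping each type index i >= 1 to its definition, a
    sorted tuple of smaller type indices (0 stands for the leaf type t_0).
    """
    table = {}        # uniquification table: product -> type index
    definitions = {}  # type index -> product

    def type_of(node):
        if node == LEAF:
            return 0
        children = tuple(sorted(type_of(child) for child in node))
        if children not in table:
            index = len(definitions) + 1
            table[children] = index
            definitions[index] = children
        return table[children]

    for tree in trees:
        for child in tree:  # the planted root itself receives no type
            type_of(child)
    return definitions


D3, D4 = (LEAF,) * 3, (LEAF,) * 4
Q_SMALL = [
    (LEAF, LEAF, (LEAF, D4)),
    (D3, D4),
    (LEAF, LEAF, LEAF, (D3, D3)),
]
print(dagify(Q_SMALL))
```

Because a type is created only after the types of its children exist, every definition refers to strictly smaller indices, which is the constraint required of the output. The numbering depends on the traversal order and need not coincide with the numbering in (18).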
A.4. Disambiguating algorithm.

The idea of the algorithm below is to consider each class of trees, $t_i$, in turn, introducing its
defining equation

$$t_i = x\, t_{\lambda^{(i)}_1} \cdots t_{\lambda^{(i)}_r} \quad (r \in \mathbb{N})$$

into the calculation, while maintaining (and refining) a partition

$$p = a_0 \oplus \cdots \oplus a_L$$

of the total class of planted trees. To be able to do so, it is crucial that the recursive equation
for $t_i$ refers to classes $t_j$ with $j < i$ only, starting with the special class $t_0 = p$, the full class of
planted trees.

At any stage in the algorithm, the class of $r$-ary trees is given as the disjoint union of Cartesian
products

$$\bigoplus_{\lambda \in \Lambda} x\, a_{\lambda_1} \cdots a_{\lambda_r} \quad \text{where } \Lambda = \{\lambda : \ell(\lambda) = r,\ 0 \le \lambda_i \le L\},$$

where $\ell(\lambda)$ denotes the number of components in the tuple $\lambda$. In the process of the algorithm
below, each class $t_i$ gets represented in a "polynomial" form like above, summed over a subset $\Lambda$
of the set of integer sequences $\lambda = (\lambda_1, \ldots, \lambda_r)$ of a given length $r$. Computing intersections
and differences of classes means merely computing intersections and differences of the $\Lambda$ in their
representations, because of the recursive structure of the input and of the algorithm itself.

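Input:

• A family $S = \{t_1, \ldots, t_m\}$ of classes of trees with recursive descriptions of the form

$$t_i = x\, t_{\lambda^{(i)}_1} \cdots t_{\lambda^{(i)}_{r_i}} \quad (r_i \in \mathbb{N}) \quad \text{for } 1 \le i \le m$$

with the constraint $\lambda^{(i)}_j < i$ for all $i$ and $j$

Output:

• an integer $L$ implying a partition

$$p = a_0 \oplus \cdots \oplus a_L$$

• a representation of each $t_i$ of the form

$$t_i = \bigoplus_{j \in I_i} a_j \quad \text{for } 0 \le i \le m \text{ and } I_i \subseteq \{0, \ldots, L\}$$

• a recursive description of the $a_i$ of the form

$$a_i = \bigoplus_{\lambda \in \Lambda_i} x\, a_{\lambda_1} \cdots a_{\lambda_{\ell(\lambda)}} \quad \text{for } 1 \le i \le L,$$

$a_0$ being implicitly described as $p \setminus (a_1 \oplus \cdots \oplus a_L)$

Algorithm:

(Initialization) Start with the trivial partition $p = a_0$ for $L = 0$, the single representation $t_0 = a_0$,
that is, $I_0 = \{0\}$.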
(Main loop) For $k$ from 1 to $m$ do

(1) replace each $t_i$ in the definition of $t_k$ with its current representation in terms of the $a_j$,
expand, and set $s$ to the result, so as to get a representation of $t_k$ of the form

$$s = \bigoplus_{\lambda \in \Lambda^{(s)}} x\, a_{\lambda_1} \cdots a_{\lambda_{\ell(\lambda)}} \quad \text{for some } \Lambda^{(s)}$$

(2) for $i$ from 1 to $L$ while $s \ne \emptyset$ do

(a) set $b$ to $a_i \cap s$ by setting $\Lambda_\cap$ to $\Lambda_i \cap \Lambda^{(s)}$

(b) if $b \ne \emptyset$, then do

(i) set $b'$ to $a_i \setminus s$

(ii) if $b' \ne \emptyset$, then

(A) create a new $a_L$ with description $b'$: increment $L$ before setting $a_L$ to $b'$,
that is, before setting $\Lambda_L$ to $\Lambda_i \setminus \Lambda^{(s)}$

(B) split $a_i$ into $a_i \oplus a_L$ in the representations of the $t_j$, that is, add $L$ into
each set $I_j$ containing $i$

(C) split $a_i$ into $a_i \oplus a_L$ in the descriptions of the $a_j$, $b$, and $s$, that is, for
each sequence in each of the $\Lambda_j$, $\Lambda_\cap$, and $\Lambda^{(s)}$, add sequences with $i$
replaced by $L$ when the sequence involves $i$ (if $i$ occurs more than once,
then replace $i$ by $i$ or $L$ in all possible ways)

(D) set $a_i$ to $b$ by setting $\Lambda_i$ to $\Lambda_\cap$

(iii) set $s$ to $s \setminus b$, which is also $s \setminus a_i$, and update $\Lambda^{(s)}$ by setting it to $\Lambda^{(s)} \setminus \Lambda_i$

(3) if $s \ne \emptyset$, then

(a) create a new $a_L$ with description $s$: increment $L$ before setting $a_L$ to $s$, that is,
before setting $\Lambda_L$ to $\Lambda^{(s)}$

(b) split $a_0$ into $a_0 \oplus a_L$ in the representations of the $t_j$, that is, add $L$ into each
set $I_j$ containing 0

(c) split $a_0$ into $a_0 \oplus a_L$ in the descriptions of the $a_j$, that is, for each sequence in
each of the $\Lambda_j$, add sequences with 0 replaced by $L$ when the sequence involves 0
(if 0 occurs more than once, then replace 0 by 0 or $L$ in all possible ways)

(4) represent $t_k$ as the union of all those $a_i$'s that have contributed a non-empty $b$ at
step (2b) and of $a_L$ if a new $a_L$ was created at step (3a), that is, create the corresponding
set $I_k$ consisting of the contributing $i$'s, together with $L$ if relevant

(Final step) Return $L$, the representations of the $t_i$ for $1 \le i \le m$, the recursive descriptions of
the $a_i$ for $1 \le i \le L$

We will explicitly show the stages through which the algorithm goes when running with the 
input H18f) . For readability, we will keep expressions in factored form. 

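$k = 1$: from $t_1 = x a_0^3$, we derive $t_1 = a_1$ and $a_1 = x(a_0 \oplus a_1)^3$.

$k = 2$: from $t_2 = x(a_0 \oplus a_1) a_1$, we derive $t_1 = a_1$, $t_2 = a_2$ and $a_1 = x p^3$, $a_2 = x p a_1$, where $p = a_0 \oplus a_1 \oplus a_2$.

$k = 3$: from $t_3 = x a_1^2$, we derive $t_1 = a_1$, $t_2 = a_2 \oplus a_3$, $t_3 = a_2$ and $a_1 = x p^3$, $a_2 = x a_1^2$, $a_3 = x(a_0 \oplus a_2 \oplus a_3) a_1$, where $p = a_0 \oplus a_1 \oplus a_2 \oplus a_3$.

$k = 4$: from $t_4 = x(a_0 \oplus a_1 \oplus a_2 \oplus a_3)^4$, we derive $t_1 = a_1$, $t_2 = a_2 \oplus a_3$, $t_3 = a_2$, $t_4 = a_4$ and $a_1 = x p^3$, $a_2 = x a_1^2$, $a_3 = x(a_0 \oplus a_2 \oplus a_3 \oplus a_4) a_1$, $a_4 = x p^4$, where $p = a_0 \oplus a_1 \oplus a_2 \oplus a_3 \oplus a_4$.

$k = 5$: from $t_5 = x(a_0 \oplus a_1 \oplus a_2 \oplus a_3 \oplus a_4) a_4$, we derive $t_1 = a_1$, $t_2 = a_2 \oplus a_3 \oplus a_5$, $t_3 = a_2$, $t_4 = a_4$, $t_5 = a_3 \oplus a_6$ and $a_1 = x p^3$, $a_2 = x a_1^2$, $a_3 = x a_1 a_4$, $a_4 = x p^4$, $a_5 = x(a_0 \oplus a_2 \oplus a_3 \oplus a_5 \oplus a_6) a_1$, $a_6 = x(a_0 \oplus a_2 \oplus a_3 \oplus a_4 \oplus a_5 \oplus a_6) a_4$, where $p = a_0 \oplus a_1 \oplus a_2 \oplus a_3 \oplus a_4 \oplus a_5 \oplus a_6$.

$k = 6$: from $t_6 = x a_4^2$, we derive $t_1 = a_1$, $t_2 = a_2 \oplus a_3 \oplus a_5$, $t_3 = a_2$, $t_4 = a_4$, $t_5 = a_3 \oplus a_6 \oplus a_7$, $t_6 = a_6$ and $a_1 = x p^3$, $a_2 = x a_1^2$, $a_3 = x a_1 a_4$, $a_4 = x p^4$, $a_5 = x(a_0 \oplus a_2 \oplus a_3 \oplus a_5 \oplus a_6 \oplus a_7) a_1$, $a_6 = x a_4^2$, $a_7 = x(a_0 \oplus a_2 \oplus a_3 \oplus a_5 \oplus a_6 \oplus a_7) a_4$, where $p = a_0 \oplus a_1 \oplus a_2 \oplus a_3 \oplus a_4 \oplus a_5 \oplus a_6 \oplus a_7$.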
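The outcome of the last stage can be sanity-checked mechanically. The following minimal Python sketch encodes each class of root out-degree 2 at stage k = 6 as a set of unordered pairs of child-class indices (this encoding and all names in the code are illustrative assumptions, not part of the algorithm as stated) and checks that the classes listed for t_2 and t_5 are pairwise disjoint and together cover the classes x p a_1 and x p a_4, respectively.

```python
# Child classes are referred to by their index: 0 stands for a_0, ..., 7 for a_7.
P = set(range(8))        # p = a_0 (+) a_1 (+) ... (+) a_7
REST = P - {1, 4}        # a_0, a_2, a_3, a_5, a_6, a_7


def pairs(left, right):
    """Unordered pairs {i, j} of child classes with i in left and j in right."""
    return {tuple(sorted((i, j))) for i in left for j in right}


# Classes of root out-degree 2 at stage k = 6, as read from the run above.
# (a_1 = x p^3 and a_4 = x p^4 have other root degrees and are not needed.)
a = {
    2: pairs({1}, {1}),   # a_2 = x a_1^2
    3: pairs({1}, {4}),   # a_3 = x a_1 a_4
    5: pairs(REST, {1}),  # a_5 = x (a_0 + a_2 + a_3 + a_5 + a_6 + a_7) a_1
    6: pairs({4}, {4}),   # a_6 = x a_4^2
    7: pairs(REST, {4}),  # a_7 = x (a_0 + a_2 + a_3 + a_5 + a_6 + a_7) a_4
}


def disjoint_union(indices):
    """Union of the classes a_i, failing if two of them overlap."""
    parts = [a[i] for i in indices]
    total = set().union(*parts)
    assert sum(len(part) for part in parts) == len(total), "classes overlap"
    return total


t2 = disjoint_union([2, 3, 5])  # expected to equal x p a_1
t5 = disjoint_union([3, 6, 7])  # expected to equal x p a_4
```

Comparing `t2` with `pairs(P, {1})` and `t5` with `pairs(P, {4})` confirms that the refinement neither loses nor duplicates any combination of child classes.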

A.5. Calculation of $K(l_0, \ldots, l_L)$: CountRootOccurrences.

Input: non-planar planted trees $\tau, \tau_1, \ldots, \tau_k$

Output: the number of occurrences of any of the $\tau_i$ at the root of $\tau$
Algorithm:

(1) fix one element $\pi'$ from GeneralToPlanar($\tau$) (see algorithm A.1)

(2) for each $i$ between 1 and $k$, compute $P_i$ = GeneralToPlanar($\tau_i$)

(3) count and return the number of pairs $(\pi_i, \pi')$ such that $\pi_i$ is an element of $P_i$ and $\pi_i$ occurs
at the root of $\pi'$
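The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: trees are encoded as nested tuples of children (a leaf is the empty tuple), `planar_embeddings` is a hypothetical stand-in for GeneralToPlanar, and "occurs at the root" is taken to mean that every internal node of the pattern has the same out-degree as the node it is mapped to, while pattern leaves match arbitrary subtrees. None of these names or encodings come from the paper.

```python
from itertools import permutations, product


def planar_embeddings(tree):
    """All distinct planar embeddings of a non-planar rooted tree.

    A tree is a tuple of subtrees; the order inside the tuple is irrelevant
    for the input and significant for the returned embeddings.
    """
    if not tree:
        return {()}
    child_sets = [planar_embeddings(child) for child in tree]
    result = set()
    for order in permutations(range(len(tree))):
        for choice in product(*(child_sets[i] for i in order)):
            result.add(tuple(choice))
    return result


def occurs_at_root(pattern, tree):
    """Positional match of a planar pattern at the root of a planar tree."""
    if not pattern:              # a pattern leaf matches any subtree
        return True
    if len(pattern) != len(tree):
        return False             # internal nodes must agree in out-degree
    return all(occurs_at_root(p, t) for p, t in zip(pattern, tree))


def count_root_occurrences(tau, patterns):
    """Steps (1)-(3): fix one embedding of tau, try all pattern embeddings."""
    pi_prime = min(planar_embeddings(tau))        # step (1): any fixed choice
    return sum(
        occurs_at_root(pi, pi_prime)
        for tau_i in patterns                     # step (2)
        for pi in planar_embeddings(tau_i)        # step (3)
    )
```

For instance, a pattern whose root has one leaf child and one binary child is counted twice at the root of the complete binary tree of height two, because either child of the root can play the role of the binary child.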

As an example we calculate $K(0, 1, 0, 1, 0, 0, 1, 0)$. This corresponds to calculating the number
of additional occurrences in the class $x a_1 a_3 a_6$. The input trees $\tau, \tau_1, \tau_2, \tau_3$ are shown in Figure 14.
Here $\tau$ corresponds to the class $x a_1 a_3 a_6$ and $\tau_1, \tau_2, \tau_3$ correspond to the three possible ways of
planting the example pattern.




Figure 14. Input trees $\tau, \tau_1, \tau_2, \tau_3$

We take as fixed planar embedding $\pi'$ of $\tau$ the embedding of Figure [21. We now iterate over
the different planar embeddings $\pi_1$ of $\tau_1$ (6 of them), $\pi_2$ of $\tau_2$ (2 of them), and $\pi_3$ of $\tau_3$ (8 of
them), and determine for each $\pi_i$ ($i \in \{1, 2, 3\}$) whether it occurs at the root of $\pi'$. Consider
for example the four embeddings shown in Figure 1101 (three embeddings of $\tau_1$, one embedding
of $\tau_3$). The leftmost embedding matches $\pi'$, and so does the one next to it. The third one does
not match $\pi'$, because the node with out-degree four is in the wrong position. The rightmost
embedding clearly does not match either. By considering all embeddings and counting the matches
we get $k = K(0, 1, 0, 1, 0, 0, 1, 0) = 3$.

The algorithm calculates the correct value of $k$, because the partition consisting of the classes $a_i$
is sufficiently fine. From this it follows that every match above of a planar embedding really gives
rise to exactly one additional pattern occurrence. See the considerations made at the beginning
of this appendix.

By now the transformation to a system of equations is easy. We get the terms by replacing a
term $x a_{j_1} \cdots a_{j_s}$ in the recursive description of $a_j$ by a term $x y_{j_1} \cdots y_{j_s} u^{K(l_0, \ldots, l_L)} / (l_0! \cdots l_L!)$. Here
it is assumed that terms that represent the same tree classes (like $x a_1 a_2$ and $x a_2 a_1$) have been identified
beforehand. It is clear that there are only finitely many terms for which $K(l_0, \ldots, l_L)$ might be non-zero
a priori.

Appendix B. Asymptotics of Analytic Systems 

The following theorem is a slightly modified version of the main theorem from [Drm97]. We
denote the transpose of a vector $\mathbf{v}$ by $\mathbf{v}^T$. Let $\mathbf{F}(x, \mathbf{y}, \mathbf{u}) = (F_1(x, \mathbf{y}, \mathbf{u}), \ldots, F_N(x, \mathbf{y}, \mathbf{u}))^T$ be a
column vector of functions $F_j(x, \mathbf{y}, \mathbf{u})$, $1 \le j \le N$, with complex variables $x$, $\mathbf{y} = (y_1, \ldots, y_N)^T$,
$\mathbf{u} = (u_1, \ldots, u_k)^T$ which are analytic around $0$ and satisfy $F_j(0, \mathbf{0}, \mathbf{0}) = 0$ for $1 \le j \le N$. We
are interested in the analytic solution $\mathbf{y} = \mathbf{y}(x, \mathbf{u}) = (y_1(x, \mathbf{u}), \ldots, y_N(x, \mathbf{u}))^T$ of the functional
equation

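(20)  $\mathbf{y} = \mathbf{F}(x, \mathbf{y}, \mathbf{u})$

with $\mathbf{y}(0, \mathbf{0}) = \mathbf{0}$, i.e., we demand that the (unknown) functions $y_j = y_j(x, \mathbf{u})$, $1 \le j \le N$, satisfy
the system of functional equations

\[
\begin{aligned}
y_1 &= F_1(x, y_1, y_2, \ldots, y_N, \mathbf{u}),\\
y_2 &= F_2(x, y_1, y_2, \ldots, y_N, \mathbf{u}),\\
&\;\;\vdots\\
y_N &= F_N(x, y_1, y_2, \ldots, y_N, \mathbf{u}).
\end{aligned}
\]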

It is convenient to define the notion of a dependency (di)graph $G_{\mathbf{F}} = (V, E)$ for such a system
of functional equations $\mathbf{y} = \mathbf{F}(x, \mathbf{y}, \mathbf{u})$. The vertices $V = \{y_1, y_2, \ldots, y_N\}$ are just the unknown
functions and an ordered pair $(y_i, y_j)$ is contained in the edge set $E$ if and only if $F_i(x, \mathbf{y}, \mathbf{u})$ really
depends on $y_j$.
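Strong connectivity of such a dependency graph is easy to test once the dependencies are known. The following Python sketch assumes the graph is given as a dictionary mapping the index i of each unknown to the set of indices j such that F_i depends on y_j; the function names and this encoding are illustrative assumptions. It checks that every vertex is reachable from a start vertex both in the graph and in its reverse.

```python
def reachable(graph, start):
    """Set of vertices reachable from start by following directed edges."""
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for w in graph.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def is_strongly_connected(graph):
    """graph[i] is the set of j such that F_i really depends on y_j."""
    vertices = set(graph) | {w for ws in graph.values() for w in ws}
    if not vertices:
        return True
    # Build the reversed graph: an edge i -> j becomes j -> i.
    reverse = {v: set() for v in vertices}
    for v, ws in graph.items():
        for w in ws:
            reverse[w].add(v)
    start = next(iter(vertices))
    return (reachable(graph, start) == vertices
            and reachable(reverse, start) == vertices)
```

For example, a system in which F_1 depends on y_1 and y_2 while F_2 depends only on y_1 is strongly connected, whereas it is not if F_2 depends only on y_2.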

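If the functions $F_j(x, \mathbf{y}, \mathbf{u})$ have non-negative Taylor coefficients then it is easy to see that the
solutions $y_j(x, \mathbf{u})$ have the same property. (One only has to solve the system iteratively by setting
$\mathbf{y}_0(x, \mathbf{u}) = \mathbf{0}$ and $\mathbf{y}_{i+1}(x, \mathbf{u}) = \mathbf{F}(x, \mathbf{y}_i(x, \mathbf{u}), \mathbf{u})$ for $i \ge 0$. The limit $\mathbf{y}(x, \mathbf{u}) = \lim_{i \to \infty} \mathbf{y}_i(x, \mathbf{u})$ is
the (unique) solution of the system above.)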
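The iterative scheme can be made concrete on the simplest example, the single equation y = x e^y for rooted labeled trees (with u set to 1). The sketch below is an illustration only, with hypothetical function names; it works with truncated power series over exact rationals, and each iteration fixes one more coefficient, all of which are non-negative.

```python
from fractions import Fraction


def series_exp(y, order):
    """exp of a power series y with y[0] == 0, truncated at degree `order`.

    Uses e' = y' e, i.e. n e_n = sum_{k=1}^{n} k y_k e_{n-k}.
    """
    e = [Fraction(0)] * (order + 1)
    e[0] = Fraction(1)
    for n in range(1, order + 1):
        e[n] = sum(k * y[k] * e[n - k] for k in range(1, n + 1)) / n
    return e


def iterate_tree_function(order):
    """Iterate y_{i+1} = x * exp(y_i), starting from y_0 = 0."""
    y = [Fraction(0)] * (order + 1)
    for _ in range(order):
        e = series_exp(y, order)
        y = [Fraction(0)] + e[:order]   # multiplication by x shifts coefficients
    return y


coefficients = iterate_tree_function(5)
```

After `order` iterations the coefficient of x^n equals n^(n-1)/n! for all n up to `order`, the well-known count of rooted labeled trees divided by n!.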
Now suppose that $G(x, \mathbf{y}, \mathbf{u})$ is another analytic function with non-negative Taylor coefficients.
Then $G(x, \mathbf{y}(x, \mathbf{u}), \mathbf{u})$ has a power series expansion

\[
G(x, \mathbf{y}(x, \mathbf{u}), \mathbf{u}) = \sum_{n, \mathbf{m}} c_{n,\mathbf{m}} x^n \mathbf{u}^{\mathbf{m}}
\]

with non-negative coefficients $c_{n,\mathbf{m}}$. In fact, we assume that for every $n \ge n_0$ there exists $\mathbf{m}$ such
that $c_{n,\mathbf{m}} > 0$.

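Let $\mathbf{X}_n$ ($n \ge n_0$) denote a $k$-dimensional discrete random vector with

(21)  $\Pr[\mathbf{X}_n = \mathbf{m}] := \dfrac{c_{n,\mathbf{m}}}{c_n}$,

where

\[
c_n = \sum_{\mathbf{m}} c_{n,\mathbf{m}}
\]

are the coefficients of

\[
G(x, \mathbf{y}(x, \mathbf{1}), \mathbf{1}) = \sum_{n \ge 0} c_n x^n .
\]

The following theorem shows that (under suitable analyticity conditions) $\mathbf{X}_n$ has a Gaussian
limiting distribution.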

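Theorem 2. Let $\mathbf{F}(x, \mathbf{y}, \mathbf{u}) = (F_1(x, \mathbf{y}, \mathbf{u}), \ldots, F_N(x, \mathbf{y}, \mathbf{u}))^T$ be functions analytic around $x = 0$,
$\mathbf{y} = (y_1, \ldots, y_N)^T = \mathbf{0}$, $\mathbf{u} = (u_1, \ldots, u_k)^T = \mathbf{0}$, whose Taylor coefficients are all non-negative,
such that $\mathbf{F}(0, \mathbf{y}, \mathbf{u}) = \mathbf{0}$, $\mathbf{F}(x, \mathbf{0}, \mathbf{u}) \neq \mathbf{0}$, $\mathbf{F}_x(x, \mathbf{y}, \mathbf{u}) \neq \mathbf{0}$, and such that there exists $j$ with
$\mathbf{F}_{y_j y_j}(x, \mathbf{y}, \mathbf{u}) \neq \mathbf{0}$. Furthermore assume that the region of convergence of $\mathbf{F}$ is large enough that
there exists a non-negative solution $x = x_0$, $\mathbf{y} = \mathbf{y}_0$ of the system of equations

\[
\begin{aligned}
\mathbf{y} &= \mathbf{F}(x, \mathbf{y}, \mathbf{1}),\\
0 &= \det(\mathbf{I} - \mathbf{F}_{\mathbf{y}}(x, \mathbf{y}, \mathbf{1})),
\end{aligned}
\]

inside it. Let

\[
\mathbf{y} = \mathbf{y}(x, \mathbf{u}) = (y_1(x, \mathbf{u}), \ldots, y_N(x, \mathbf{u}))^T
\]

denote the analytic solutions of the system

(22)  $\mathbf{y} = \mathbf{F}(x, \mathbf{y}, \mathbf{u})$

with $\mathbf{y}(0, \mathbf{u}) = \mathbf{0}$ and assume that $d_{n,j} > 0$ ($1 \le j \le N$) for $n \ge n_1$, where $y_j(x, \mathbf{1}) = \sum_{n \ge 0} d_{n,j} x^n$.
Moreover, let $G(x, \mathbf{y}, \mathbf{u})$ denote an analytic function with non-negative Taylor coefficients such that
the point $(x_0, \mathbf{y}(x_0, \mathbf{1}), \mathbf{1})$ is contained in the region of convergence. Finally, let random vectors $\mathbf{X}_n$
($n \ge n_0$) be defined by (21).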

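If the dependency graph $G_{\mathbf{F}} = (V, E)$ of the system in the unknown functions $y_1(x, \mathbf{u}), \ldots,
y_N(x, \mathbf{u})$ is strongly connected then the sequence of random vectors $\mathbf{X}_n$ admits a Gaussian limiting
distribution with mean value

\[
\mathbf{E}\,\mathbf{X}_n = \boldsymbol{\mu} n + O(1) \qquad (n \to \infty)
\]

and covariance matrix

\[
\mathbf{Cov}(\mathbf{X}_n, \mathbf{X}_n) = \boldsymbol{\Sigma} n + O(1) \qquad (n \to \infty).
\]

The row vector $\boldsymbol{\mu}$ is given by

\[
\boldsymbol{\mu} = -\frac{x_{\mathbf{u}}(\mathbf{1})}{x(\mathbf{1})},
\]

and the matrix $\boldsymbol{\Sigma}$ by

(23)  $\boldsymbol{\Sigma} = -\dfrac{x_{\mathbf{u}\mathbf{u}}(\mathbf{1})}{x(\mathbf{1})} + \boldsymbol{\mu}^T \boldsymbol{\mu} + \operatorname{diag}(\boldsymbol{\mu})$,

where $x = x(\mathbf{u})$ (and $\mathbf{y} = \mathbf{y}(\mathbf{u}) = \mathbf{y}(x(\mathbf{u}), \mathbf{u})$) is the solution of the (extended) system

(24)  $\mathbf{y} = \mathbf{F}(x, \mathbf{y}, \mathbf{u})$,

(25)  $0 = \det(\mathbf{I} - \mathbf{F}_{\mathbf{y}}(x, \mathbf{y}, \mathbf{u}))$.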
The proof of Theorem 2 is exactly the same as that given in [Drm97]. The main observation
is that the assumptions above show that the solutions $y_j(x, \mathbf{u})$ admit a local representation of the
form

\[
y_j(x, \mathbf{u}) = g_j(x, \mathbf{u}) - h_j(x, \mathbf{u}) \sqrt{1 - \frac{x}{x(\mathbf{u})}}
\]

(where $\mathbf{u}$ is close to $\mathbf{1}$ and $x$ close to $x_0 = x(\mathbf{1})$). The assumption that the dependency graph is
strongly connected ensures that the location of the singularity of all functions $y_j(x, \mathbf{u})$ is determined
by the common function $x(\mathbf{u})$. Thus, we get the same property for $G(x, \mathbf{y}(x, \mathbf{u}), \mathbf{u})$:

(26)  $G(x, \mathbf{y}(x, \mathbf{u}), \mathbf{u}) = g(x, \mathbf{u}) - h(x, \mathbf{u}) \sqrt{1 - \dfrac{x}{x(\mathbf{u})}}$.

It is then well known (see [BR83, Drm94]) that a square-root singularity plus some minor conditions
implies asymptotic normality of the coefficients (in the sense introduced above) with mean and
covariance expressed in terms of derivatives of $x(\mathbf{u})$. Note, for example, that the assumption
$d_{n,j} > 0$ for $n \ge n_1$ ensures that $c_n > 0$ for sufficiently large $n$, and from this it follows that
$x = x(\mathbf{1})$ is the only singularity on the radius of convergence of $G(x, \mathbf{y}(x, \mathbf{1}), \mathbf{1})$.

In what follows we comment on the evaluation of $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$. The problem is to extract the
derivatives of $x(\mathbf{u})$. The function $x(\mathbf{u})$ is the solution of the system (24)-(25) and is exactly the
location of the singularity of the mapping $x \mapsto \mathbf{y}(x, \mathbf{u})$ when $\mathbf{u}$ is fixed (and close to $\mathbf{1}$).

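Let $x(\mathbf{u})$ and $\mathbf{y}(\mathbf{u}) = \mathbf{y}(x(\mathbf{u}), \mathbf{u})$ denote the solutions of (24)-(25). Then we have

(27)  $\mathbf{y}(\mathbf{u}) = \mathbf{F}(x(\mathbf{u}), \mathbf{y}(\mathbf{u}), \mathbf{u})$.

Taking derivatives with respect to $\mathbf{u}$ we get

(28)  $\mathbf{y}_{\mathbf{u}}(\mathbf{u}) = \mathbf{F}_x(x(\mathbf{u}), \mathbf{y}(\mathbf{u}), \mathbf{u})\, x_{\mathbf{u}}(\mathbf{u}) + \mathbf{F}_{\mathbf{y}}(x(\mathbf{u}), \mathbf{y}(\mathbf{u}), \mathbf{u})\, \mathbf{y}_{\mathbf{u}}(\mathbf{u}) + \mathbf{F}_{\mathbf{u}}(x(\mathbf{u}), \mathbf{y}(\mathbf{u}), \mathbf{u})$,

where the three terms in $\mathbf{F}$ denote evaluations at $(x(\mathbf{u}), \mathbf{y}(\mathbf{u}), \mathbf{u})$ of the partial derivatives of $\mathbf{F}$,
and where $x_{\mathbf{u}}$ and $\mathbf{y}_{\mathbf{u}}$ denote the Jacobian of $x$ resp. $\mathbf{y}$ with respect to $\mathbf{u}$. In particular, for $\mathbf{u} = \mathbf{1}$
we have $x(\mathbf{1}) = x_0$ and $\mathbf{y}(\mathbf{1}) = \mathbf{y}_0$ and, of course,

\[
\det(\mathbf{I} - \mathbf{F}_{\mathbf{y}}(x_0, \mathbf{y}_0, \mathbf{1})) = 0 .
\]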

Since $\mathbf{F}_{\mathbf{y}}$ is a non-negative matrix and the dependency graph is strongly connected there is a
unique Perron-Frobenius eigenvalue of multiplicity 1. Here this eigenvalue equals 1. Thus, $\mathbf{I} - \mathbf{F}_{\mathbf{y}}$
has rank $N - 1$ and has (up to scaling) a unique positive left eigenvector $\mathbf{b}^T$:

\[
\mathbf{b}^T (\mathbf{I} - \mathbf{F}_{\mathbf{y}}(x_0, \mathbf{y}_0, \mathbf{1})) = \mathbf{0} .
\]

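From (28) we obtain

\[
(\mathbf{I} - \mathbf{F}_{\mathbf{y}}(x_0, \mathbf{y}_0, \mathbf{1}))\, \mathbf{y}_{\mathbf{u}}(\mathbf{1}) = \mathbf{F}_x(x_0, \mathbf{y}_0, \mathbf{1})\, x_{\mathbf{u}}(\mathbf{1}) + \mathbf{F}_{\mathbf{u}}(x_0, \mathbf{y}_0, \mathbf{1}) .
\]

By multiplying by $\mathbf{b}^T$ from the left we thus get

(29)  $\mathbf{b}^T \mathbf{F}_x(x_0, \mathbf{y}_0, \mathbf{1})\, x_{\mathbf{u}}(\mathbf{1}) + \mathbf{b}^T \mathbf{F}_{\mathbf{u}}(x_0, \mathbf{y}_0, \mathbf{1}) = \mathbf{0}$

and consequently

\[
\boldsymbol{\mu} = \frac{1}{x_0} \, \frac{\mathbf{b}^T \mathbf{F}_{\mathbf{u}}(x_0, \mathbf{y}_0, \mathbf{1})}{\mathbf{b}^T \mathbf{F}_x(x_0, \mathbf{y}_0, \mathbf{1})} .
\]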
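As a numerical illustration of this formula, consider a toy system that is not taken from the paper: the single equation y = F(x, y, u) with F = x(u + e^y - 1), which describes rooted labeled trees with u marking leaves. Here N = 1, so the left eigenvector b is just the scalar 1. The Python sketch below (all names are hypothetical) compares the value obtained from F_u and F_x with a finite-difference estimate of -x_u(1)/x(1).

```python
import math


def singular_point(u, iterations=200):
    """Solve y = F(x, y, u) and 1 = F_y(x, y, u) for F = x (u + e^y - 1).

    The second equation gives x = e^{-y}; substituting into the first yields
    y = 1 + (u - 1) e^{-y}, which is solved by fixed-point iteration.
    """
    y = 1.0
    for _ in range(iterations):
        y = 1.0 + (u - 1.0) * math.exp(-y)
    return math.exp(-y), y          # (x(u), y(u))


x0, y0 = singular_point(1.0)

# Partial derivatives of F at (x0, y0, 1); with N = 1 the vector b is 1.
F_x = 1.0 + math.exp(y0) - 1.0      # dF/dx = u + e^y - 1 at u = 1
F_u = x0                            # dF/du = x

mu_formula = (1.0 / x0) * F_u / F_x

# Independent check: mu = -x_u(1) / x(1) via a central difference.
h = 1e-4
x_plus, _ = singular_point(1.0 + h)
x_minus, _ = singular_point(1.0 - h)
mu_finite_difference = -((x_plus - x_minus) / (2.0 * h)) / x0
```

Both values agree with 1/e, the asymptotic proportion of leaves, which matches the constant for nodes of degree 1 quoted in the introduction.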



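The derivation of $\boldsymbol{\Sigma}$ is more involved. We first define $\mathbf{b}(x, \mathbf{y}, \mathbf{u})$ as the (generalized) vector
product^6 of the $N - 1$ last columns of the matrix $\mathbf{I} - \mathbf{F}_{\mathbf{y}}(x, \mathbf{y}, \mathbf{u})$. Observe that

\[
D(x, \mathbf{y}, \mathbf{u}) := \left( \mathbf{b}^T(x, \mathbf{y}, \mathbf{u}) \, (\mathbf{I} - \mathbf{F}_{\mathbf{y}}(x, \mathbf{y}, \mathbf{u})) \right)_1 = \det(\mathbf{I} - \mathbf{F}_{\mathbf{y}}(x, \mathbf{y}, \mathbf{u})) .
\]

In particular we have

\[
D(x(\mathbf{u}), \mathbf{y}(\mathbf{u}), \mathbf{u}) = 0 .
\]

^6 More precisely, this is the wedge product combined with the Hodge duality.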
Then from

\[
(\mathbf{I} - \mathbf{F}_{\mathbf{y}})\, \mathbf{y}_{\mathbf{u}} = \mathbf{F}_x x_{\mathbf{u}} + \mathbf{F}_{\mathbf{u}},
\]
(30)  $-D_{\mathbf{y}} \mathbf{y}_{\mathbf{u}} = D_x x_{\mathbf{u}} + D_{\mathbf{u}}$

we can calculate $\mathbf{y}_{\mathbf{u}}$. (The first system has rank $N - 1$; this means that we can skip the first
equation. This reduced system is then completed to a regular system by appending the second
equation (30).)

di(u) = di(x(u),y(u),u) = b(x(u), y(u), u) T F x (x(u), y(u), u) 
d 2 (u) = d 2 (a;(u), y(u), u) = b(x(u), y(u), u) T F u (ir(u), y(u), u). 
By differentiating equation l|29l) we get 

/ s _ (di x x u + d ly y u + di u )x u + (d 2x x u + d 2y y u + d 2u ) 

«i 

where di x , di y , di u , d 2a; , d 2y , d 2u denote the respective partial derivatives and where we omitted 
the dependence on u. With the knowledge of xq, yo and y u (l) we can now evaluate :r uu at u = 1 
and we finally calculate £ from l|23[l. 

Appendix C. Proof of Lemma ^ 

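In this appendix we prove the lemma stating that the determinant $\det(\mathbf{I} - \mathbf{F}_{\mathbf{a}}(x, \mathbf{a}, 1))$ is
given by

\[
\det(\mathbf{I} - \mathbf{F}_{\mathbf{a}}(x, \mathbf{a}, 1)) = 1 - x e^{a_0 + a_1 + \cdots + a_L} .
\]

We first observe that the sum of all rows of $\mathbf{I} - \mathbf{F}_{\mathbf{a}}(x, \mathbf{a}, 1)$ equals

\[
\left( 1 - x e^{a_0 + a_1 + \cdots + a_L}, \; 1 - x e^{a_0 + a_1 + \cdots + a_L}, \; \ldots, \; 1 - x e^{a_0 + a_1 + \cdots + a_L} \right);
\]

compare with (14). Hence, we get

\[
\det(\mathbf{I} - \mathbf{F}_{\mathbf{a}}(x, \mathbf{a}, 1)) = \left( 1 - x e^{a_0 + a_1 + \cdots + a_L} \right) \det M(x, \mathbf{a}),
\]

where $M(x, \mathbf{a})$ denotes the matrix $\mathbf{I} - \mathbf{F}_{\mathbf{a}}$ in which we replace the first row by $(1, 1, \ldots, 1)$. Thus, it
remains to prove that $\det M(x, \mathbf{a}) = 1$.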

For this purpose we have to be more explicit with the partition $A = \{a_0, a_1, \ldots, a_L\}$. More
precisely we construct $A$ recursively from level to level. This procedure is similar to that of Proposition 3
but not the same. In order to make our arguments more transparent we restrict ourselves to
4 steps. Note that this procedure also provides a recursive description of the polynomials $P_j(\mathbf{a}, 1)$.

One starts with $A_0 = \{d_0, d_1\}$, where $d_0 = a_0$ and $d_1 = p \setminus a_0$. This means that $d_0$ collects
all trees where the root out-degree is not contained in $D$ and $d_1$ those where it is contained in $D$.
For example, if $D = \{2\}$ then the generating functions of this (trivial) partition are given by
$d_1(x, 1) = x p(x)^2 / 2$ and by $d_0(x, 1) = p(x) - d_1(x, 1) = p(x) - x p(x)^2 / 2$.
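For this trivial partition the determinant identity can be checked by hand or numerically. The sketch below is a toy case with L = 1, not the refined partition the lemma is really about: it assumes the system a_0 = x(e^p - p^2/2), a_1 = x p^2/2 with p = a_0 + a_1, which corresponds to the two generating functions above together with p = x e^p. Function names are hypothetical.

```python
import math


def jacobian(x, a0, a1):
    """Jacobian F_a of F = (x (e^p - p^2 / 2), x p^2 / 2) with p = a0 + a1."""
    p = a0 + a1
    row0 = x * (math.exp(p) - p)   # derivative of x (e^p - p^2/2) w.r.t. a0 or a1
    row1 = x * p                   # derivative of x p^2 / 2 w.r.t. a0 or a1
    return [[row0, row0], [row1, row1]]


def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det_I_minus_Fa(x, a0, a1):
    j = jacobian(x, a0, a1)
    return det2([[1.0 - j[0][0], -j[0][1]],
                 [-j[1][0], 1.0 - j[1][1]]])


def det_M(x, a0, a1):
    """I - F_a with the first row replaced by (1, 1)."""
    j = jacobian(x, a0, a1)
    return det2([[1.0, 1.0],
                 [-j[1][0], 1.0 - j[1][1]]])


x, a0, a1 = 0.2, 0.15, 0.05
lhs = det_I_minus_Fa(x, a0, a1)
rhs = 1.0 - x * math.exp(a0 + a1)
```

At any sample point `lhs` and `rhs` coincide up to rounding and `det_M` returns 1, in line with the factorisation used in the proof.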

Then we partition $d_1$ according to the structure of the subtrees of the root, where we distinguish
between the previous classes $d_0$ and $d_1$. We get $A_1 = \{c_0, c_1, \ldots, c_m\}$, where $c_0 = d_0$ and $c_1 \oplus \cdots \oplus
c_m = d_1$. In particular, if $D = \{2\}$ then $m = 3$, the class $c_1$ collects all trees with root out-degree 2
where both subtrees of the root are in class $a_0 = d_0$, $c_2$ collects all trees with root out-degree 2
where one subtree of the root is in class $a_0 = d_0$ and the other one in class $d_1$, and $c_3$ collects those
trees where both subtrees of the root are in class $d_1$. The corresponding generating functions are
given by $c_1(x, 1) = x d_0(x, 1)^2 / 2$, by $c_2(x, 1) = x d_0(x, 1) d_1(x, 1)$, and by $c_3(x, 1) = x d_1(x, 1)^2 / 2$.
Of course, we also have $c_0(x, 1) = d_0(x, 1)$ and $c_1(x, 1) + c_2(x, 1) + c_3(x, 1) = d_1(x, 1)$.
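The last identity is a one-line computation for $D = \{2\}$, using only $d_0(x, 1) + d_1(x, 1) = p(x)$ and $d_1(x, 1) = x p(x)^2 / 2$ (arguments are suppressed for readability):

```latex
\[
c_1 + c_2 + c_3
  = \frac{x}{2} d_0^2 + x\, d_0 d_1 + \frac{x}{2} d_1^2
  = \frac{x}{2} (d_0 + d_1)^2
  = \frac{x}{2}\, p(x)^2
  = d_1 .
\]
```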

In the same fashion we proceed further. We partition $c_s$ ($1 \le s \le m$) according to the
structure of the subtrees of the root (that are now taken from $\{c_0, \ldots, c_m\}$) and denote them by
$A_2 = \{b_0, b_1, \ldots, b_\ell\}$. Further we define sets $C_s$ by $c_s = \bigoplus_{r \in C_s} b_r$. If $D = \{2\}$ then $b_0 = c_0$,
$b_1 = c_1$, $c_2$ is divided into three parts, and $c_3$ is divided into 6 parts: $C_1 = \{1\}$, $C_2 = \{2, 3, 4\}$,
$C_3 = \{5, 6, 7, 8, 9, 10\}$.^7



^By the way this leads to the partition that is used in the proof of Theorem^resp. of Proposition HI 
Finally, we partition $b_j$ ($j \ge 1$) according to the structure of the subtrees of the root
that are taken from the $b_i$ and denote them by $A = \{a_0, a_1, \ldots, a_L\}$. As in the previous step
we define sets $B_r$ by $b_r = \bigoplus_{j \in B_r} a_j$. In general we have to iterate this procedure until a certain
level and get almost the same partition as in the proof of Proposition 3. The only difference is
that at the lowest level we only distinguish between nodes with degree in $D$ and degree not in $D$.
However this is no real restriction as we can extend the partition above with an additional level
and we will have a well-defined number of additional occurrences for each class. We again obtain
a partition which fits Proposition 3.

We recall that this recursive procedure directly provides a recursive description of the system
of functional equations. In particular we have

\[
a_j(x, 1) = x P_j(a_0(x, 1), a_1(x, 1), \ldots, a_L(x, 1), 1),
\]

where $P_j(\cdot, 1)$ can actually be written as a polynomial in $b_1, \ldots, b_\ell$.
Next

\[
b_r(x, 1) = x Q_r(b_0(x, 1), b_1(x, 1), \ldots, b_\ell(x, 1), 1),
\]

where $Q_r(\cdot, 1)$ can be written as a polynomial in $c_0, c_1, \ldots, c_m$. Further,

\[
Q_r = \sum_{j \in B_r} P_j .
\]

In other words, the sum $\sum_{j \in B_r} P_j$ can be written as a polynomial in the $c_s$.
Finally,

\[
c_s(x, 1) = x R_s(c_0(x, 1), c_1(x, 1), \ldots, c_m(x, 1)),
\]

where $R_s(\cdot, 1)$ can be written as a polynomial in $d_0 = a_0$ and $d_1 = a_1 + \cdots + a_L$ and we have

\[
R_s = \sum_{r \in C_s} Q_r .
\]

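Let $G(x, \mathbf{a})$ denote the $L \times L$-submatrix of $\mathbf{F}_{\mathbf{a}}$ where we omit the first row and column. Then
$G(x, \mathbf{a})$ has the following structure:

\[
G(x, \mathbf{a}) =
\begin{pmatrix}
G_{11} & \cdots & G_{1m} \\
\vdots & \ddots & \vdots \\
G_{m1} & \cdots & G_{mm}
\end{pmatrix},
\]

where

\[
G_{s's''} = \left( B_{r'r''} \right)_{r' \in C_{s'},\, r'' \in C_{s''}}
\]

and

\[
B_{r'r''} = \left( x P_{i,a_j} \right)_{i \in B_{r'},\, j \in B_{r''}} .
\]

The condition that $P_i$ can be written as a polynomial in the $b_j$ implies that $P_{i,a_{j_1}} = P_{i,a_{j_2}}$ for all
$j_1, j_2 \in B_{r''}$, that is, each row of $B_{r'r''}$ is either zero or all entries are the same.
Further, if we fix $r'$ and sum over all rows $i \in B_{r'}$ then we get

\[
\sum_{i \in B_{r'}} x P_{i,a_j} = x Q_{r',a_j} .
\]

Since $Q_{r'}$ can be written as a polynomial in $c_s$ ($0 \le s \le m$) we have $Q_{r',a_{j_1}} = Q_{r',a_{j_2}}$ for all
$j_1, j_2 \in \overline{C}_{s''}$, where we set $\overline{C}_s = \bigcup_{r \in C_s} B_r$.

Similarly if we fix $s'$ and sum over all rows $i \in \overline{C}_{s'}$ then we get

\[
\sum_{i \in \overline{C}_{s'}} x P_{i,a_j} = x R_{s',a_j} .
\]

Since $R_{s'}$ can be written as a polynomial in $d_0 = a_0$ and $d_1 = a_1 + \cdots + a_L$ we have $R_{s',a_{j_1}} = R_{s',a_{j_2}}$
for all $1 \le j_1, j_2 \le L$.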
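Now we will calculate the determinant of the matrix

\[
M(x, \mathbf{a}) =
\begin{pmatrix}
1 & 1 \cdots 1 \\
* & I - G(x, \mathbf{a})
\end{pmatrix}
=
\begin{pmatrix}
1 & 1 \cdots 1 & \cdots & 1 \cdots 1 \\
* & I - G_{11} & \cdots & -G_{1m} \\
\vdots & \vdots & \ddots & \vdots \\
* & -G_{m1} & \cdots & I - G_{mm}
\end{pmatrix} .
\]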

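(By $*$ we denote an entry we do not care about.) We now perform the following row operations. For
every $s' = 1, \ldots, m$ we substitute the first row of

\[
\left( \; * \; \middle| \; -G_{s'1} \; \middle| \; \cdots \; \middle| \; I - G_{s's'} \; \middle| \; \cdots \; \middle| \; -G_{s'm} \; \right)
\]

by the sum of the corresponding rows $i \in \overline{C}_{s'}$. Since $R_{s',a_{j_1}} = R_{s',a_{j_2}}$ for all $1 \le j_1, j_2 \le L$ this
sum of the rows has the form

\[
\left( \; * \; \middle| \; -x R_{s',a_j} \cdots -x R_{s',a_j} \; \middle| \; \cdots \; \middle| \; 1 - x R_{s',a_j} \cdots 1 - x R_{s',a_j} \; \middle| \; \cdots \; \middle| \; -x R_{s',a_j} \cdots -x R_{s',a_j} \; \right) .
\]

We now add the very first row (that equals $(1, 1, \ldots, 1)$) $x R_{s',a_j}$ times to this row and obtain

\[
w_{s'} = \left( \; * \; \middle| \; 0 \cdots 0 \; \middle| \; \cdots \; \middle| \; 1 \cdots 1 \; \middle| \; \cdots \; \middle| \; 0 \cdots 0 \; \right) .
\]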

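Next we fix $s'$ and $r'$ such that $r' \in C_{s'}$ and substitute the first row of

\[
\left( \; * \; \middle| \; (-B_{r'j})_{j \in C_1} \; \middle| \; \cdots \; \middle| \; (I \cdot \delta_{r'j} - B_{r'j})_{j \in C_{s'}} \; \middle| \; \cdots \; \middle| \; (-B_{r'j})_{j \in C_m} \; \right)
\]

by the sum of the rows $i \in B_{r'}$. Since for every $s''$ it holds that $Q_{r',a_{j_1}} = Q_{r',a_{j_2}}$ for all $j_1, j_2 \in \overline{C}_{s''}$
this sum has the following form

\[
\left( \; * \; \middle| \; (-x Q_{r',a_j})_{j \in \overline{C}_1} \; \middle| \; \cdots \; \middle| \; (\delta_{r'j} - x Q_{r',a_j})_{j \in \overline{C}_{s'}} \; \middle| \; \cdots \; \middle| \; (-x Q_{r',a_j})_{j \in \overline{C}_m} \; \right),
\]

where $\delta_{r'j} = 1$ if and only if $j \in B_{r'}$ and $\delta_{r'j} = 0$ otherwise. This means that for every $s'' \neq s'$ the entries
$(-x Q_{r',a_j})_{j \in \overline{C}_{s''}}$ are all equal, while for $s'' = s'$ we have to add 1 at the proper positions. For
every $s''$ we now add row $w_{s''}$ $x Q_{r',a_j}$ times. If $s'' \neq s'$ then we get a zero block $(0, \ldots, 0)$. If
$s'' = s'$ we get a block of the form

\[
\left( \; 0 \cdots 0 \;\; 1 \cdots 1 \;\; 0 \cdots 0 \; \right) .
\]

This means that this row is replaced by

\[
w_{s',r'} = \left( \; * \; \middle| \; 0 \cdots 0 \; \middle| \; \cdots \; \middle| \; 0 \cdots 0 \; \middle| \; 0 \cdots 0 \;\; 1 \cdots 1 \;\; 0 \cdots 0 \; \middle| \; 0 \cdots 0 \; \middle| \; \cdots \; \middle| \; 0 \cdots 0 \; \right) .
\]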

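With the help of these rows we can eliminate all further entries of $M(x, \mathbf{a})$ that come from $G(x, \mathbf{a})$.
(Here we use the fact that each row of $B_{r'r''}$ is either zero or all entries are the same.) This means
that we finally end up with a matrix of the form

\[
H =
\begin{pmatrix}
1 & 1 \cdots 1 & \cdots & 1 \cdots 1 \\
* & H_{11} & \cdots & H_{1m} \\
\vdots & \vdots & \ddots & \vdots \\
* & H_{m1} & \cdots & H_{mm}
\end{pmatrix},
\]

where $H_{s's''} = 0$ for $s' \neq s''$ and $H_{s's'}$ is of the form

\[
\begin{pmatrix}
J & K & \cdots & K \\
0 & J & \ddots & \vdots \\
\vdots & \ddots & \ddots & K \\
0 & \cdots & 0 & J
\end{pmatrix}
\]

with

\[
J =
\begin{pmatrix}
1 & 1 & \cdots & 1 \\
0 & 1 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}
\quad \text{and} \quad
K =
\begin{pmatrix}
1 & 1 & \cdots & 1 \\
0 & 0 & \cdots & 0 \\
\vdots & & & \vdots \\
0 & 0 & \cdots & 0
\end{pmatrix} .
\]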

It is now an easy task to transform the matrix $(H_{s's''})_{1 \le s', s'' \le m}$ (with the help of row transforms)
to the identity matrix. Furthermore we can transform the very first row $(1, 1, \ldots, 1)$ of $H$ to
$(1, 0, \ldots, 0)$ and end up with a matrix of the form

\[
\begin{pmatrix}
1 & 0 & \cdots & 0 \\
* & 1 & & \\
\vdots & & \ddots & \\
* & & & 1
\end{pmatrix} .
\]

Obviously, this matrix has determinant 1. Since the above row transforms do not change the value
of the determinant we thus obtain $\det M(x, \mathbf{a}) = 1$.

Acknowledgement. The authors want to thank Philippe Flajolet for several discussions on the 